Question 1

Note that the problem does not place any restrictions on the distance function: the distance function may assign any arbitrary value d(c,l) to the pair (c,l), regardless of the value assigned to other pairs. In particular, the distances to not need to correspond to our "intuitive" notion of distance. For example, consider the following distance function:

d(c_1, l_1)=1
d(c_2, l_1)=100
d(c_1, l_2)=1
d(c_2, l_2)=1

These distances do not correspond to our intuitive notion of "distance": there is a path c_2, l_2, c_1, l_1 with total distance 3, and yet d(c_2, l_1) is 100. Functions like this are allowed in the definition of the problem in Question 1.

Question 2

In class we defined PARTITION to have a set of integers as its input. You may assume that the version in which all the numbers are non-negative is also NP-complete. That is, for question 2 you can use the following language PARTITION+:

PARTITION+
Instance: Natural numbers p_1,...,p_n
Question: Does there exist a subset S of the numbers, such that the numbers in S have the same sum as the numbers not in S?

Question 4

It was pointed out to me that I didn't give you the necessary background for solving question 4 properly. As with the example of CLIQUE-OPT we did in lecture, the first step must be to determine the optimal value of B for which there exists a solution. In the case of CLIQUE-OPT, we simply did a linear search, querying the decision program on (G,1),(G,2),(G,3),... until we find the largest value of k such that it accepts (G,k): then k is the size of the largest clique in G. However, if you do this for problem 4 the number of steps will not be polynomial in the input size. Below is an example of a similar problem, and a solution that runs in polynomial time. In short, the idea is to use binary search. In the case of CLIQUE-OPT, this results in log(n) steps instead of a linear number.

For this example, the optimization problem is SUBSET-MAX-SUM-OPT: given integers s_1,...,s_n, and t, find a subset of {s_1,...,s_n} with maximum sum, subject to the constraint that the sum is at most t. The corresponding decision problem is: given s_1,...,s_n,t, and an integer b, decide whether there exists a subset of {s_1,...,s_n} whose sum is at least b and at most t. We are trying to prove that, given a polynomial time algorithm for the decision problem, we can solve the optimization problem in polynomial time.

Let's say M is a polynomial time algorithm for the decision problem. First, run M on (s_1,...,s_n,t,0) to make sure that there is a subset that sums to t or less (if M rejects this input then return "no solution"). Next we do a binary search on b in the range [0,t] to find the maximum value for which M accepts (s_1,...,s_n,t,b). Note that this takes O(log(t)) steps, which is polynomial in the input size. However, if we had done a linear search by running M on (s_1,...,s_n,t,1),(s_1,...,s_n,t,2), and so on, then the number of steps would have been linear in t, which is not polynomial time (since t is represented by O(log(t)) bits in the input).

I expect that you all know how binary search works, but you don't need to write down the details of the search in your solution. It is enough to say "use binary search". The important thing is that you need to specify a range {a,a+1,a+2,\ldots,b} in which to search (i.e. a lower bound and an upper bound on the possible values that the optimum can have). The number of steps required by the binary search is then O(log(b-a)).

Note that in the above example, the reason why binary search works is that, if we are convinced that the optimal value lies in the range [x,y] then we can run the decision algorithm on (s_1,...,s_n,t,z), where z=(y-x)/2. If the algorithm accepts we are then convinced that the optimal value is at least z, so we can narrow the range to [z,y]. If it rejects then we are convinced that the optimal value is smaller than z, and we can narrow the range to [x,z-1]. In both cases the range is cut in half, so after O(log(t)) iterations we have shrunk the range [0,t] down to [b,b], at which point b must be the optimal value.