================================= WEEK 6 ==================================== Bisection method ---------------- * This a special case of the bracketing method where we cut the given interval, say [a,b], in half in each of the iterations (Step (1) in the general algorithm above). * Recall the Intermediate Value Theorem (IVT): If f is continuous on some interval [a,b] and f(a) != f(b), then f takes on every value between f(a) and f(b) in interval [a,b]. * In particular, as already seen above, if we know an interval [a,b] on which f is continuous and such that f(a) < 0 and f(b) > 0 (or the other way around), then we know that f has a root inside [a,b]. * How can we use the IVT to help us find a root? Idea: if we know f has a root in [a,b], check the midpoint m = (a+b)/2 and determine if there is a root in [a,m] or in [m,b]. This is somewhat like a "binary search" for the root. * How do we know which subinterval to pick? Check the signs! Pick [a,m] if f(a) and f(m) have different signs; otherwise, pick [m,b]. (Different signs? Check f(a)*f(m) < 0.) * Just keep doing this until the size of the interval is "small enough". What does that mean? We must be given a `tolerance parameter' t > 0 which tells us when we can stop: as soon as the size of the interval becomes smaller than t. Convergence and Error Analysis for the bisection method -------------------------------------------------------- * Approximation error: If we let a_i, b_i, m_i represent the values of a, b, and m after the i-th iteration of the algorithm, then what do we know? Well, there *is* a real root in the interval [a_i,b_i] so if we return m_i = (a_i + b_i) / 2 as the approximate root, we know it cannot be more than 1/2 * (b_i - a_i) away from the real root. So the error e_i is at most (b_i-a_i)/2. Let - r be the root, - x_i the midpoint of the i-th interval, and - e_i the approximation error at the i-th iteration. Then |e_i| = |x_i - r| < (1/2)(b_(n-1)-a_(n-1)) = [1/2^n](b-a) So |e_i+1| ~ 1/2 |e_i| ("~" is short for approximately). * How many steps needed if we want 10^(-9) approximation error assuming (b - a) = 1? Answer: 1/2^n <= 10^(-9) => n ~ 30; thus about 30 iterations are needed. * Convergence: because the size of the interval is halved at each step, and the error cannot be larger than the size of the interval, it follows that the error converges at a linear rate. * The bisection method has the advantage that it *always* converges, but the disadvantage that it converges slowly. Newton's Method --------------- * Use the tangent to the curve at current estimate of the root. * Suppose that we are looking for a root of some differentiable function f(x), and that we have some initial "guess" x for the value of a root. Then the function f is approximated by y = f(x_k) + f'(x_k)(x-x_k), which is the straight line generated by the tangent to f at x_k (see picture). If we set y = 0 and solve the equation above for x, we get a Newton estimate for the root. From the formula above, we get a general iterative form, which is x_k+1 = x_k - f(x_k)/f'(x_k). * Equivalently, if we write the Taylor series of order 1 for f around the root r, we get: f(r) ~ f(x) + (r - x) f'(x) (where I use the notation "~" to mean "approximately equal"). But by definition, we know that f(r) = 0, so we can solve for r in the previous equation to get: r ~ x - f(x) / f'(x) This suggests that we could use the RHS of the formula above to "improve" our guess x by compute a new approximation for r based on x. Note that we can get the same formula geometrically, by approximating the function f around x to find out where it intersects the x-axis. [See Picture drawn in class] * From the ideas above, we get the following algorithm: starting from an initial guess x_0, compute the successive approximations x_1, x_2, ... using the formula x_{i+1} = x_i - f(x_i) / f'(x_i). When do we stop? We could use two stopping conditions: (1) stop when f(x_i) gets "close enough" to 0 (which hopefully means that x_i is "close enough" to a root), or (2) stop when x_{i+1} and x_i get "close enough", which means that there is very little change from one value to the next, so the values have hopefully converged to a root. Ideally, we want to stop when the values get "close enough" to the actual root, i.e., when the relative error between the computed approximations and the real root gets smaller than some specified "tolerance". If we don't know the root (which is the whole point of Newston's method in the first place), then we have to use the approximate relative error instead of the relative error, so we stop when successive values get "close enough", i.e., when the relative change from one value to the next gets smaller than a certain tolerance. * We would expect that each approximation is closer to a root than the previous one, but can we show this rigorously? Convergence and Error Analysis for Newton's Method -------------------------------------------------- * From Taylor's series for f around the root r, it is possible to show that (r - x_{i+1}) = - f''(r)/2f'(r) * (r - x_i)^2 as long as the method converges. That is, the error for x_{i+1} is proportional to the square of the error for x_i, which is a good thing because as soon as x_i gets close enough to the root, it means that the number of significant digits will approximately double at every step. * This result only holds if the sequence of approximations *does* converge to a root r. Can we always guarantee this? * Graphically, it is easy to come up with situations where Newton's method performs poorly. [Pictures] All of these examples have one thing in common: they involve a function f whose derivative f' has points where it is equal (or nearly equal) to zero. It is possible to make a general statement, that if f(x) is a "convex" or "concave" function whose derivative does not equal zero anywhere inside some interval [a,b] such that f contains a root in [a,b], then Newton's method will converge starting from any point in [a,b] (in which case, by the analysis above, it will converge quadratically). The Secant method ----------------- * In order to be able to apply Newton's method, we have to know f' (and to compute it at every point). What if we don't know the derivative, or we want to minimize the number of function evaluations? * We know that the derivative is defined as a limit, as follows: f'(x) = lim [f(x+h) - f(x)] / h h->0 Since we want to compute the derivative in the context of finding a sequence of approximations to a root, we would expect that successive approximations will be quite close to each other, so that we can approximate the derivative itself as follows: f'(x_i) ~ [f(x_i) - f(x_{i-1})] / [x_i - x_{i-1}] If we substitute this back into the equation for Newton's method, we get: f(x_i) * (x_i - x_{i-1}) x_{i+1} = x_i - ------------------------ f(x_i) - f(x_{i-1}) f(x_i) * x_{i-1} - f(x_{i-1}) * x_i = ----------------------------------- f(x_i) - f(x_{i-1}) * The "secant method" works exactly like Newton's method, except that we need *two* initial guesses x_0 and x_1, and each successive value after that is computed using the formula above. * As for Newton's method, we can use different stopping conditions: usually, we stop when the approximate relative error falls below some specified tolerance. * From the equation for the secant method, it is possible to derive the following bound on the approximate error (we won't go through the details): (r - x_{i+1}) = - f''(r)/2f'(r) * (r - x_i) * (r - x_{i-1}) With some further algebra, it is possible to show that this is essentially equivalent to having a convergence rate of order 1.618... (instead of order 2 for Newton's method).