Assignment 5¶

Due Date: April 5, 8:59pm

Please see the guidelines at https://www.cs.toronto.edu/~lczhang/338/homework.html

What to Hand In¶

Please hand in 2 files:

Python file containing all your code, named csc338_a5.py. If you are using Jupyter Notebook to complete the work, your notebook can be exported as a .py file (File -> Download As -> Python). Your code will be auto-graded using Python 3.6, so please make sure that your code runs. There will be a 20% penalty if you need a remark due to small issues that render your code untestable.
PDF file named csc338_a5_written.pdf containing your solutions to the written parts of the assignment. Your solutions can be hand-written, but must be legible. Graders may deduct marks for illegible or poorly presented solutions.

Submit the assignment on MarkUs by 9pm on the due date. See the syllabus for the course policy regarding late assignments. All assignments must be done individually.

import math
import numpy as np

Q1. Golden Section Search [12 pt]¶

Part (a) [8 pt]¶

Write a function golden_section_search that uses the Golden Section search to find a minima of a function f on the interval [a, b]. You may assume that the function f is unimodal on [a, b].

The function should evaluate f only as many times as necessary. There will be a penalty for calling f more times than necessary.

Refer to algo 6.1 in the textbook, or slide 19 of the posted slides accompanying the textbook.

def golden_section_search(f, a, b, n=10):
    """ Return the Golden Section search intervals generated in 
    an attempt to find a minima of a function `f` on the interval
    `[a, b]`. The returned list should have length `n+1`.
   
    Do not evaluate the function `f` more times than necessary.
    
    Example: (as always, these are for illustrative purposes only)

    >>> golden_section_search(lambda x: x**2 + 2*x + 3, -2, 2, n=5)
    [(-2, 2),
     (-2, 0.4721359549995796),
     (-2, -0.4721359549995796),
     (-1.4164078649987384, -0.4721359549995796),
     (-1.4164078649987384, -0.8328157299974766),
     (-1.1934955049953735, -0.8328157299974766)]
    """

Part (b) [4 pt]¶

Consider the following functions, both of which are unimodal on [0, 2]

$f_1 (x) = 2 x^2 - 2x + 3$

$f_2 (x) = - x e^{-x^2}$

Run the golden section search on the two functions for $n=30$ iterations.

Save the returned lists in the variables golden_f1 and golden_f2

def f1(x): 
    return 2 * x**2 - 2*x + 3

golden_f1 = None

def f2(x): 
    return - x * np.exp(-x*x)

golden_f2 = None

Q2. Newton's Method in One Dimension [20 pt]¶

Part (a) [8 pt]¶

Implement Newton's Method to find a minima of a real function in one dimension.

def newton_1d(f, df, ddf, x, n=10):
    """ Return the list of iterates computed when using 
    Newton's Method to find a local minimum of a function `f`,
    starting from the point `x`. The parameter `df` is the 
    derivative of `f`. The parameter `ddf` is the second 
    derivative of `f`. The length of the returned list 
    should be `n+1`.
    
    Example: (as always, these are for illustrative purposes only)
    
    >>> newton_1d(lambda x: x**3, 
    ...           lambda x: 3 * (x**2),
    ...           lambda x: 6 * x, 
    ...           x=1.0, n=5)
    [1, 0.5, 0.25, 0.125, 0.0625, 0.03125]
    """
    return []

Part (b) [4 pt]¶

Consider f1 and f2 from Question 1 Part (b).

Complete the functions df1, df2 which compute the derivatives of f1 and f2.

Complete the functions ddf1, ddf2 which compute the second derivatives of f1 and f2.

def df1(x):
    return None
def ddf1(x):
    return None

def df2(x):
    return None
def ddf2(x):
    return None

Part (c) [4 pt]¶

Run Newton's Method on the two functions $f_1$ and $f_2$ for $n=30$ iterations, starting at $x = 1$.

Save the returned lists in the variables newton_f1 and newton_f2.

newton_f1 = None
newton_f2 = None

Part (d) [4 pt]¶

Newton's method should have a quadratic convergence rate. Do you observe a quadratic convergence rate in newton_f1 and newton_f2?

Explain your observation and justification in your PDF writeup.

Q3. Matrix Types [16 pt]¶

Part (a) [6 pt]¶

Write a function to check if a symmetric matrix is positive definite, negative definite, or indefinite using the Cholesky factorization. Use the Cholesky Factorization algorithm we discussed in class. You might find your Assignment 3 code to be helpful.

def is_positive_definite(M):
    """ Return True iff the M is positive definite. That is, 
    if the Cholesky Factorization of M is successful.
    
    Precondition: M is symmetric
    
    Examples:
    >>> is_positive_definite(np.array([[3., 1.], [1., 3.]]))
    True
    >>> is_positive_definite(np.array([[3., 1.], [1., -3.]]))
    False    
    """

Part (b) [4 pt]¶

Write a function matrix_type to check whether a matrix is positive definite, negative definite, or indefinite. You can use the is_positive_definite function you wrote in part (a) as a helper function.

# use these constants below to prevent mispelling
POS_DEF = "positive definite"
NEG_DEF = "negative definite"
INDEF = "indefinite"

def matrix_type(M):
    """ Return:
      * "positive definite" is M is positive definite,
      * "negative definite" if M is negative definite,
      * "indefinite" if neither of the two holds
    
    Precondition: M is non-singular and symmetric
    
    >>> matrix_type(np.array([[3., 1.], [1., 3.]])) == POS_DEF
    True
    """

Part (b) [3 pt]¶

Use the function you wrote in part (b) to determine whether each of the following matrices M1, M2, and M3 is positive definite, negative definite, or indefinite.

Store the values corresponding to constants POS_DEF, NEG_DEF, or INDEF in the variables q3b_M1, q3b_M2, q3b_M3.

M1 = np.array([[3., 2., 1.],
               [2., 5., -1.],
               [1., -1., 2.]])
M2 = np.array([[-5., 3., 1.],
               [3., -5., -2.],
               [1., -2., -2.]])
M3 = np.array([[3., -2., 1.],
               [-2., 1., -1.],
               [1., -1., 2.]])

q3b_M1 = None
q3b_M2 = None
q3b_M3 = None

# Example
M0 = np.array([[1., 0., 0.],
               [0., 1., 0.],
               [0., 0., 2.]])
q3b_M0 = POS_DEF

Part (c) [3 pt]¶

Determine, by hand, whether each of the following matrices M4, M5, M6, is positive definite, negative definite, or indefinite.

Store the values corresponding to constants POS_DEF, NEG_DEF, or INDEF in the variables q3b_M4, q3b_M5, q3b_M6.

M4 = np.array([[2, 1],
               [1, 2] ])
M5 = np.array([[-1, 1],
               [1, -2]])
M6 = np.array([[4, -1],
               [-1, 4]])

q3b_M4 = None
q3b_M5 = None
q3b_M6 = None

Q4. Optimization in Multiple Dimensions [32 pt]¶

Consider the function

$$f(x_0, x_1) = 2x_0 ^4 + 3 x_1^4 - x_0^2 x_1^2 - x_0^2 - 4 x_1^2 + 7 $$

Part (a) [4 pt]¶

Derive the gradient. Include your solution in your PDF writeup.

Part (b) [4 pt]¶

Derive the Hessian. Include your solution in your PDF writeup.

Part (c) [8 pt]¶

Write a function steepest_descent_f that uses a variation of the steepest descent method to solve for a local minimum of $f(x_0, x_1)$ from part (a). Instead of performing a line search as in Algorithm 6.3, the parameter $\alpha$ will be provided to you as a parameter. Likewise, the initial guess $(x_0, x_1)$ will be provided.

Use (1, 1) as the initial value and perform 10 iterations of the steepest descent variant. Save the result in the variable q4c_steepest. (The result q4c_steepest should be a list of length 11)

def steepest_descent_f(init_x0, init_x1, alpha, n=5):
    """ Return the $n$ steps of steepest descent on the function 
    f(x_0, x_1) given in part(a), starting at (init_x0, init_x1).
    The returned value is a list of tuples (x_0, x_1) visited
    by the steepest descent algorithm, and should be of length
    n+1. The parameter alpha is used in place of performing
    a line search.
    
    Example:
    
    >>> steepest_descent_f(0, 0, 0.5, n=0)
    [(0, 0)]
    """
    
q4c_steepest = []

Part (d) [8 pt]¶

Write a function newton_f that uses Newton's method to find a local minimum of the function $f(x_0, x_1)$ given an initial value. You may use matrix inversion from np.linalg.

Use (1, 1) as the initial value and perform 10 iterations of Newton's Method. Save the result in the variable q4d_newton. (The result q4d_newton should be a list of length 11)

def newton_f(init_x0, init_x1, n=5):
    """ Return the $n$ steps of Newton's Method on the function 
    f(x_0, x_1) give in part(a), starting at (init_x0, init_x1).
    The returned value is a list of tuples (x_0, x_1) visited
    by Newton's Method, and should be of length n+1. 
    
    Example:
    
    >>> newton_f(0, 0, n=0)
    [(0, 0)]
    """
    
q4d_newton = []

Part (e) [4 pt]¶

Compare the two algorithms, and the observed convergence rates of $e_k$ of the sequences in parts (c) and (d). Which one converged faster?

Explain your observation and your finding in your PDF writeup.

Part (d) [4 pt]¶

Without running your programs, find three other local minima of $f(x)$. Hint: notice that the value of $f(x)$ does not depend on the signs of $x_0$ and $x_1$.

Include your solution in your PDF writeup.