Gradient Descent for Functions of One Variable

Consider the function

$$f(x) = .1 x^2 + \sin(.1 (x-2)^2)$$

The derivative for that function is

$$f'(x) = .2 x+ \cos(.1 (x-2)^2)(.2(x-2))$$

Let's graph the function first:

In [5]:
#Needed to display images inline in Jupyter
%matplotlib inline  
####################################

from numpy import *
from matplotlib.pyplot import *

def f(x):
    return .1*x**2 + sin(.1*(x-2)**2)

def dfdx(x):
    return .2*x+cos(.1*(x-2)**2)*(.2*(x-2))

x = arange(-10, 10, .1)
y = f(x)

figure(1)
plot(x, y)
xlabel("x")
ylabel("$.1 x^2 + sin(.1 (x-2)^2)$")
show()

We can now write the gradient descent algorithm

In [2]:
def grad_descent(f, dfdx, init_x, alpha):
    EPS = 1e-3
    prev_x = init_x-2*EPS
    x = init_x
    
    iter = 0
    while abs(x - prev_x) >  EPS:
        prev_x = x
        x -= alpha*dfdx(x)
        
        print "Iter %d: x = %.2f, f(x) = %.2f"  % (iter, x, f(x))
        iter += 1
    
    return x

Let's try running gradient descent starting from the initial guess of 8.

In [3]:
grad_descent(f, dfdx, 8, 0.1)
Iter 0: x = 7.95, f(x) = 5.93
Iter 1: x = 7.90, f(x) = 5.91
Iter 2: x = 7.85, f(x) = 5.89
Iter 3: x = 7.81, f(x) = 5.87
Iter 4: x = 7.76, f(x) = 5.85
Iter 5: x = 7.72, f(x) = 5.83
Iter 6: x = 7.68, f(x) = 5.81
Iter 7: x = 7.64, f(x) = 5.80
Iter 8: x = 7.60, f(x) = 5.78
Iter 9: x = 7.56, f(x) = 5.77
Iter 10: x = 7.52, f(x) = 5.75
Iter 11: x = 7.48, f(x) = 5.73
Iter 12: x = 7.44, f(x) = 5.72
Iter 13: x = 7.40, f(x) = 5.70
Iter 14: x = 7.35, f(x) = 5.68
Iter 15: x = 7.31, f(x) = 5.66
Iter 16: x = 7.26, f(x) = 5.64
Iter 17: x = 7.22, f(x) = 5.62
Iter 18: x = 7.17, f(x) = 5.59
Iter 19: x = 7.12, f(x) = 5.57
Iter 20: x = 7.06, f(x) = 5.54
Iter 21: x = 7.01, f(x) = 5.50
Iter 22: x = 6.95, f(x) = 5.47
Iter 23: x = 6.89, f(x) = 5.43
Iter 24: x = 6.82, f(x) = 5.38
Iter 25: x = 6.75, f(x) = 5.33
Iter 26: x = 6.67, f(x) = 5.27
Iter 27: x = 6.59, f(x) = 5.21
Iter 28: x = 6.51, f(x) = 5.13
Iter 29: x = 6.42, f(x) = 5.05
Iter 30: x = 6.32, f(x) = 4.95
Iter 31: x = 6.22, f(x) = 4.85
Iter 32: x = 6.12, f(x) = 4.73
Iter 33: x = 6.00, f(x) = 4.60
Iter 34: x = 5.89, f(x) = 4.46
Iter 35: x = 5.76, f(x) = 4.31
Iter 36: x = 5.64, f(x) = 4.15
Iter 37: x = 5.51, f(x) = 3.97
Iter 38: x = 5.37, f(x) = 3.79
Iter 39: x = 5.24, f(x) = 3.61
Iter 40: x = 5.10, f(x) = 3.42
Iter 41: x = 4.96, f(x) = 3.23
Iter 42: x = 4.82, f(x) = 3.04
Iter 43: x = 4.69, f(x) = 2.86
Iter 44: x = 4.55, f(x) = 2.68
Iter 45: x = 4.42, f(x) = 2.51
Iter 46: x = 4.29, f(x) = 2.35
Iter 47: x = 4.17, f(x) = 2.19
Iter 48: x = 4.05, f(x) = 2.04
Iter 49: x = 3.93, f(x) = 1.91
Iter 50: x = 3.81, f(x) = 1.78
Iter 51: x = 3.70, f(x) = 1.66
Iter 52: x = 3.60, f(x) = 1.55
Iter 53: x = 3.49, f(x) = 1.44
Iter 54: x = 3.39, f(x) = 1.35
Iter 55: x = 3.30, f(x) = 1.26
Iter 56: x = 3.21, f(x) = 1.17
Iter 57: x = 3.12, f(x) = 1.10
Iter 58: x = 3.04, f(x) = 1.03
Iter 59: x = 2.95, f(x) = 0.96
Iter 60: x = 2.88, f(x) = 0.90
Iter 61: x = 2.80, f(x) = 0.85
Iter 62: x = 2.73, f(x) = 0.80
Iter 63: x = 2.66, f(x) = 0.75
Iter 64: x = 2.59, f(x) = 0.71
Iter 65: x = 2.53, f(x) = 0.67
Iter 66: x = 2.47, f(x) = 0.63
Iter 67: x = 2.41, f(x) = 0.60
Iter 68: x = 2.35, f(x) = 0.57
Iter 69: x = 2.30, f(x) = 0.54
Iter 70: x = 2.25, f(x) = 0.51
Iter 71: x = 2.20, f(x) = 0.49
Iter 72: x = 2.15, f(x) = 0.46
Iter 73: x = 2.10, f(x) = 0.44
Iter 74: x = 2.06, f(x) = 0.42
Iter 75: x = 2.02, f(x) = 0.41
Iter 76: x = 1.98, f(x) = 0.39
Iter 77: x = 1.94, f(x) = 0.38
Iter 78: x = 1.90, f(x) = 0.36
Iter 79: x = 1.86, f(x) = 0.35
Iter 80: x = 1.83, f(x) = 0.34
Iter 81: x = 1.80, f(x) = 0.33
Iter 82: x = 1.76, f(x) = 0.32
Iter 83: x = 1.73, f(x) = 0.31
Iter 84: x = 1.70, f(x) = 0.30
Iter 85: x = 1.68, f(x) = 0.29
Iter 86: x = 1.65, f(x) = 0.28
Iter 87: x = 1.62, f(x) = 0.28
Iter 88: x = 1.60, f(x) = 0.27
Iter 89: x = 1.57, f(x) = 0.27
Iter 90: x = 1.55, f(x) = 0.26
Iter 91: x = 1.53, f(x) = 0.26
Iter 92: x = 1.51, f(x) = 0.25
Iter 93: x = 1.49, f(x) = 0.25
Iter 94: x = 1.47, f(x) = 0.24
Iter 95: x = 1.45, f(x) = 0.24
Iter 96: x = 1.43, f(x) = 0.24
Iter 97: x = 1.41, f(x) = 0.23
Iter 98: x = 1.40, f(x) = 0.23
Iter 99: x = 1.38, f(x) = 0.23
Iter 100: x = 1.37, f(x) = 0.23
Iter 101: x = 1.35, f(x) = 0.22
Iter 102: x = 1.34, f(x) = 0.22
Iter 103: x = 1.32, f(x) = 0.22
Iter 104: x = 1.31, f(x) = 0.22
Iter 105: x = 1.30, f(x) = 0.22
Iter 106: x = 1.29, f(x) = 0.22
Iter 107: x = 1.28, f(x) = 0.22
Iter 108: x = 1.26, f(x) = 0.21
Iter 109: x = 1.25, f(x) = 0.21
Iter 110: x = 1.24, f(x) = 0.21
Iter 111: x = 1.23, f(x) = 0.21
Iter 112: x = 1.22, f(x) = 0.21
Iter 113: x = 1.22, f(x) = 0.21
Iter 114: x = 1.21, f(x) = 0.21
Iter 115: x = 1.20, f(x) = 0.21
Iter 116: x = 1.19, f(x) = 0.21
Iter 117: x = 1.18, f(x) = 0.21
Iter 118: x = 1.18, f(x) = 0.21
Iter 119: x = 1.17, f(x) = 0.21
Iter 120: x = 1.16, f(x) = 0.21
Iter 121: x = 1.16, f(x) = 0.20
Iter 122: x = 1.15, f(x) = 0.20
Iter 123: x = 1.14, f(x) = 0.20
Iter 124: x = 1.14, f(x) = 0.20
Iter 125: x = 1.13, f(x) = 0.20
Iter 126: x = 1.13, f(x) = 0.20
Iter 127: x = 1.12, f(x) = 0.20
Iter 128: x = 1.12, f(x) = 0.20
Iter 129: x = 1.11, f(x) = 0.20
Iter 130: x = 1.11, f(x) = 0.20
Iter 131: x = 1.10, f(x) = 0.20
Iter 132: x = 1.10, f(x) = 0.20
Iter 133: x = 1.09, f(x) = 0.20
Iter 134: x = 1.09, f(x) = 0.20
Iter 135: x = 1.09, f(x) = 0.20
Iter 136: x = 1.08, f(x) = 0.20
Iter 137: x = 1.08, f(x) = 0.20
Iter 138: x = 1.08, f(x) = 0.20
Iter 139: x = 1.07, f(x) = 0.20
Iter 140: x = 1.07, f(x) = 0.20
Iter 141: x = 1.07, f(x) = 0.20
Iter 142: x = 1.06, f(x) = 0.20
Iter 143: x = 1.06, f(x) = 0.20
Iter 144: x = 1.06, f(x) = 0.20
Iter 145: x = 1.06, f(x) = 0.20
Iter 146: x = 1.05, f(x) = 0.20
Iter 147: x = 1.05, f(x) = 0.20
Iter 148: x = 1.05, f(x) = 0.20
Iter 149: x = 1.05, f(x) = 0.20
Iter 150: x = 1.05, f(x) = 0.20
Iter 151: x = 1.04, f(x) = 0.20
Iter 152: x = 1.04, f(x) = 0.20
Iter 153: x = 1.04, f(x) = 0.20
Iter 154: x = 1.04, f(x) = 0.20
Iter 155: x = 1.04, f(x) = 0.20
Iter 156: x = 1.04, f(x) = 0.20
Iter 157: x = 1.03, f(x) = 0.20
Iter 158: x = 1.03, f(x) = 0.20
Iter 159: x = 1.03, f(x) = 0.20
Iter 160: x = 1.03, f(x) = 0.20
Iter 161: x = 1.03, f(x) = 0.20
Iter 162: x = 1.03, f(x) = 0.20
Iter 163: x = 1.03, f(x) = 0.20
Iter 164: x = 1.03, f(x) = 0.20
Iter 165: x = 1.02, f(x) = 0.20
Iter 166: x = 1.02, f(x) = 0.20
Iter 167: x = 1.02, f(x) = 0.20
Iter 168: x = 1.02, f(x) = 0.20
Out[3]:
1.0211110523835374

Let's demonstrate that we could end up in a bad local minimum:

In [4]:
grad_descent(f, dfdx, -10, 0.1)
Iter 0: x = -9.86, f(x) = 10.72
Iter 1: x = -9.65, f(x) = 10.16
Iter 2: x = -9.33, f(x) = 8.98
Iter 3: x = -8.93, f(x) = 7.38
Iter 4: x = -8.57, f(x) = 6.36
Iter 5: x = -8.36, f(x) = 6.03
Iter 6: x = -8.25, f(x) = 5.92
Iter 7: x = -8.18, f(x) = 5.88
Iter 8: x = -8.14, f(x) = 5.87
Iter 9: x = -8.11, f(x) = 5.86
Iter 10: x = -8.09, f(x) = 5.86
Iter 11: x = -8.07, f(x) = 5.86
Iter 12: x = -8.06, f(x) = 5.86
Iter 13: x = -8.06, f(x) = 5.86
Iter 14: x = -8.05, f(x) = 5.85
Iter 15: x = -8.05, f(x) = 5.85
Iter 16: x = -8.04, f(x) = 5.85
Iter 17: x = -8.04, f(x) = 5.85
Iter 18: x = -8.04, f(x) = 5.85
Iter 19: x = -8.04, f(x) = 5.85
Iter 20: x = -8.04, f(x) = 5.85
Out[4]:
-8.036475915977233