Although the following refinement algorithm has nothing to do with neural networks, the idea for it was inspired by how the neural net backpropagation learning algorithm works. In backpropagation, a function measures the error in the output of a neural network, given a certain input. The neural network ``learns'' by adjusting its internal variables $w$ to reduce its output error; learning stops when the output error is sufficiently small. It reduces the error by taking partial derivatives of the error function with respect to all the internal variables $w$, producing the gradient of the error with respect to $w$. Moving $w$ a small amount in the direction opposite the gradient should then reduce the error. The amount by which $w$ is moved along the negative gradient is called the learning rate. In one sentence, backpropagation says, ``Find a direction to move that reduces the error, and then move a little bit in that direction.''
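The gradient step described above can be sketched in a few lines. This is a minimal illustration only, with an arbitrary quadratic error function standing in for a neural network's error; the names `error`, `gradient`, and `descend` are assumptions for the sketch, not anything from the text.

```python
import numpy as np

def error(w, w_star):
    """Squared-distance error between current variables w and a target w*."""
    return float(np.sum((w - w_star) ** 2))

def gradient(w, w_star):
    """Gradient of the error with respect to w: dE/dw = 2(w - w*)."""
    return 2.0 * (w - w_star)

def descend(w, w_star, learning_rate=0.1, iterations=50):
    """Repeatedly move a small amount opposite the gradient of the error."""
    for _ in range(iterations):
        w = w - learning_rate * gradient(w, w_star)
    return w

w_star = np.array([1.0, -2.0])
w_final = descend(np.zeros(2), w_star)
# After 50 small steps against the gradient, the error is tiny.
```

Each step here scales the residual $w - w^*$ by a constant factor less than one, which is the simplest possible instance of ``move a little bit in the direction that reduces the error.''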
The inspiration for shadowing is this: once we have computed the 1-step error vectors $e_i$ along the noisy trajectory $\{p_i\}$, it seems reasonable that the 1-step error at step $i$ would be reduced if we simply subtracted a fraction $c$ of $e_i$ from $p_i$, for some $0 < c < 1$. This is the basis for what I call a Local Error Subtraction method.
A naïve implementation of the above idea may be simply to set $p_i \leftarrow p_i - c\,e_i$ for $i$ from 1 to $S$. However, this does not define how to change the initial point $p_0$, because there is no $e_0$. Since any new true trajectory must have a different initial condition, we must find a way to correct $p_0$.
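As a concrete sketch of this naïve update, the following applies $p_i \leftarrow p_i - c\,e_i$ along a noisy orbit of a toy map. The logistic map, the noise level, and the function names are illustrative assumptions; note how $p_0$ is necessarily left untouched.

```python
import numpy as np

def f(x):
    return 3.8 * x * (1.0 - x)   # an arbitrary 1-D map standing in for the system

def one_step_errors(p):
    """Forward 1-step errors e_i = p_i - f(p_{i-1}), defined for i = 1 .. S."""
    return p[1:] - f(p[:-1])

def naive_les(p, c=0.9):
    """Subtract a fraction c of each 1-step error; p_0 is never corrected."""
    e = one_step_errors(p)
    q = p.copy()
    q[1:] -= c * e               # there is no e_0, so q[0] stays unchanged
    return q

rng = np.random.default_rng(0)
x, p = 0.4, [0.4]
for _ in range(20):              # an exact orbit perturbed by small noise
    x = f(x)
    p.append(x + 1e-6 * rng.standard_normal())
p = np.array(p)

q = naive_les(p)
old_e, new_e = one_step_errors(p), one_step_errors(q)
# q[0] == p[0]; the first 1-step error shrinks by exactly the factor (1 - c),
# while later errors also pick up a term propagated from the previous correction.
```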
To allow corrections to $p_0$, I introduce the backward error
\[
b_i = p_i - f^{-1}(p_{i+1}),
\]
where $f^{-1}$ integrates the solution backwards from time $i+1$ to time $i$, and $b_i$ is defined on $0 \le i \le S-1$, where $S$ is the number of shadow steps. Thus each timestep from 1 to $S-1$ has both a forward error $e_i$ and a backward error $b_i$; timestep 0 has only a backward error $b_0$, and timestep $S$ has only a forward error $e_S$.
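The index bookkeeping for the two kinds of error can be made concrete as follows. The affine map is an arbitrary invertible choice so that the backward integrator is available in closed form; everything here is an illustrative sketch.

```python
import numpy as np

def f(x):
    return 0.5 * x + 1.0          # one step forward in time (toy invertible map)

def f_inv(x):
    return 2.0 * (x - 1.0)        # one step backward in time

def forward_errors(p):
    """e_i = p_i - f(p_{i-1}), defined for timesteps i = 1 .. S."""
    return p[1:] - f(p[:-1])

def backward_errors(p):
    """b_i = p_i - f_inv(p_{i+1}), defined for timesteps i = 0 .. S-1."""
    return p[:-1] - f_inv(p[1:])

p = np.array([0.0, 1.1, 1.5, 1.80])   # a short noisy orbit with S = 3
e = forward_errors(p)                  # errors at timesteps 1, 2, 3
b = backward_errors(p)                 # errors at timesteps 0, 1, 2
# Timesteps 1 and 2 have both kinds of error; timestep 0 has only b_0,
# and timestep S = 3 has only e_3.
```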
Subtracting some linear combination of $e_i$ and $b_i$ from $p_i$ (setting, for example, $p_i \leftarrow p_i - c\,(e_i + b_i)/2$) does not work for any reasonable values of $c$ that I have tried (between 0.1 and 0.9). I believe this is because 1-step errors computed in the forward direction will be slightly unstable in their components that lie in the unstable subspace, and backwards 1-step errors will be unstable in their components that lie in the stable subspace. The method works, however, if we rewrite $e_i$ using equation 1.2 (which was $e_i = e_i^s \hat s_i + e_i^u \hat u_i$), write $b_i = b_i^s \hat s_i + b_i^u \hat u_i$ similarly, and update $p_i$ as
\begin{equation}
p_i \leftarrow p_i - c \left( e_i^s \hat s_i + b_i^u \hat u_i \right).
\end{equation}
In other words, we use only the components of the 1-step error that are computationally stable.
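To make the update concrete, here is a minimal sketch of an SLES sweep on a toy hyperbolic linear map whose stable and unstable directions are known exactly (the coordinate axes), so no resolvents are needed. The map, orbit, noise level, and all names are assumptions for the sketch, not the thesis implementation.

```python
import numpy as np

A = np.diag([2.0, 0.5])             # toy map f(x) = A x; hyperbolic fixed point
A_inv = np.diag([0.5, 2.0])
u_hat = np.array([1.0, 0.0])        # unstable direction (stretched by A)
s_hat = np.array([0.0, 1.0])        # stable direction (contracted by A)

def sles_sweep(p, c=0.9):
    """One SLES pass: subtract the stable component of the forward error
    and the unstable component of the backward error at each point."""
    q = p.copy()
    e = p[1:] - p[:-1] @ A.T        # forward errors e_1 .. e_S
    b = p[:-1] - p[1:] @ A_inv.T    # backward errors b_0 .. b_{S-1}
    q[1:]  -= c * np.outer(e @ s_hat, s_hat)   # stable part of e_i, i = 1..S
    q[:-1] -= c * np.outer(b @ u_hat, u_hat)   # unstable part of b_i, i = 0..S-1
    return q

def max_error(p):
    return np.abs(p[1:] - p[:-1] @ A.T).max()

rng = np.random.default_rng(1)
true_orbit = np.array([[0.01 * 2.0**i, 0.5**i] for i in range(12)])
p = true_orbit + 1e-4 * rng.standard_normal(true_orbit.shape)

before = max_error(p)
for _ in range(20):
    p = sles_sweep(p)
after = max_error(p)
# Repeated sweeps drive the 1-step errors down by orders of magnitude.
```

Note that the endpoints fall out naturally: $p_S$ receives only a stable correction (it has only $e_S$), and $p_0$ receives only an unstable correction (it has only $b_0$).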
The SLES method seems to converge about as quickly as the optimized GHYS method introduced in the next chapter. For example, using $c = 0.9$, 1-step errors are generally reduced by about a factor of 10 per refinement iteration, as one would expect when subtracting away 90% of the error. However, I have not tested it extensively, and since it seems to have a weaker theoretical basis than the GHYS method, I have not used it in any of the results mentioned in later chapters. Nevertheless, I have no reason to believe that the method is unreliable, so further testing in the future seems warranted.
Finally, note that SLES makes no explicit use of the linear maps $L_i$. Currently, the only use of the resolvents in SLES is implicit, through the construction of the stable and unstable directions. If a method of computing the stable and unstable directions can be found that does not use the linear maps (e.g., the eigenvector decomposition mentioned above), then the SLES method will not use the linear maps at all, and they will not need to be computed. Since computing the linear maps (for an ODE system) is currently by far the most computationally intensive part of the program, eliminating them entirely is an exciting prospect.
It is perhaps important to note that there is a fundamental difference between the GHYS refinement procedure, which I call a global method, and SLES, which I call a local method. Let the initial noisy orbit be $\{p_i^{(0)}\}_{i=0}^{S}$, with the superscript referring to the refinement iteration. Assuming the GHYS refinement procedure can be written as a Newton's method, it can be summarized as
\[
p^{(k+1)} = p^{(k)} - \left[ g'(p^{(k)}) \right]^{-1} g(p^{(k)}),
\]
where $g$ was defined near the beginning of this chapter as the function computing the forward 1-step errors on the entire orbit, and $g'$ is the Jacobian of $g$. Thus the entire orbit is updated using a single (large) equation. SLES, on the other hand, applies the local correction formula of equation 1.9 to each point independently; namely,
\[
p_i^{(k+1)} = p_i^{(k)} - c \left( e_i^s \hat s_i + b_i^u \hat u_i \right),
\]
where $e_i$ and $b_i$ are the errors of the orbit $\{p_i^{(k)}\}$.
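The shape of the global update can be illustrated in code for a toy affine map, where one orbit-wide linear solve corrects every point at once. A least-squares pseudo-inverse stands in for the actual GHYS solver here, and the map, orbit, and names are all assumptions for the sketch.

```python
import numpy as np

def f(x):
    return 0.5 * x + 1.0          # toy map; f'(x) = 0.5 everywhere

def g(p):
    """All forward 1-step errors of the orbit, stacked into one vector."""
    return p[1:] - f(p[:-1])

def g_jacobian(p):
    """S x (S+1) Jacobian of g: dg_i/dp_i = 1, dg_i/dp_{i-1} = -f'."""
    S = len(p) - 1
    J = np.zeros((S, S + 1))
    for i in range(S):
        J[i, i + 1] = 1.0
        J[i, i] = -0.5
    return J

def global_newton_step(p):
    """Correct every point of the orbit with a single linear solve."""
    delta, *_ = np.linalg.lstsq(g_jacobian(p), g(p), rcond=None)
    return p - delta

p = np.array([0.0, 1.2, 1.4, 1.9, 1.95])   # a noisy orbit
refined = global_newton_step(p)
# Because this toy f is affine, one global step zeroes every 1-step error,
# but every correction depends on every 1-step error through the solve.
```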
One consequence of this difference occurs when the 1-step errors in the GHYS method ``explode'' and cause the iteration to fail. The explosion seems to occur much faster using GHYS than using SLES. I believe this is because in GHYS, all the 1-step errors along the entire trajectory contribute to the computation of every correction, so that if a single 1-step error is too big for its linear map to be valid, then the corrections for all the points will be invalid. In SLES, however, there is no such dependence between points at arbitrary distances along the trajectory, so a point with a large 1-step error can propagate its instability to neighbors $j$ shadow steps away only after $j$ refinement iterations.
Currently, the only known way to isolate glitches using the GHYS method is the way QT did it, namely to try shadowing for longer and longer times until the orbit explodes, and then to do a binary search over shorter and longer shadows until the shadow step at which the glitch occurs is pinpointed. This is a lot of work. Because of the slow way glitches propagate their effects in SLES, it may be possible to isolate glitches simply by the large errors at those points and their neighbors. It may even be possible to pinpoint multiple glitches along a single trajectory. For example, points $p_{i_1}$ and $p_{i_2}$ would be glitches if we could shadow the trajectory from 0 to $i_1$, from $i_1$ to $i_2$, and from $i_2$ to $S$, but could not shadow any part of the trajectory containing $i_1$ or $i_2$ in its interior. This introduces the concept of shadowable segments, although it is not yet clear what use such a concept is.
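The glitch-isolation idea suggested above can be sketched as a simple threshold test on the local 1-step errors. The map, the corrupted orbit, and the threshold value are all illustrative assumptions.

```python
import numpy as np

def f(x):
    return 0.5 * x + 1.0          # toy map from the earlier sketches

def forward_errors(p):
    return p[1:] - f(p[:-1])      # e_1 .. e_S

def glitch_candidates(p, threshold=1e-3):
    """Timesteps whose forward 1-step error exceeds the threshold."""
    e = np.abs(forward_errors(p))
    return [i + 1 for i in range(len(e)) if e[i] > threshold]

# An orbit that is accurate everywhere except one badly placed point p_3.
p = np.array([0.0, 1.0, 1.5, 2.5, 1.875 + 1e-7])
bad = glitch_candidates(p)
# Timesteps 3 and 4 are flagged: the bad point corrupts the 1-step error
# both into and out of it, localizing a candidate glitch near timestep 3,
# with shadowable segments on either side of it.
```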