The on-line interpreter.

New approach: use the interpreter to compute and execute optimal policies incrementally, on-line.

- Given a program p and the current situation s, compute off-line an optimal policy and the residual program p' (the part of p that remains to be executed).

- Determine the first action a of the optimal policy.

- Execute the action a in the real world

- If a is stochastic, obtain the sensory information needed to identify which outcome of a actually occurred. If a is a sense action, connect to the sensors and read the data.

- Repeat.

The cycle of computing an optimal policy and the residual program, executing the first action, and getting sensory information (if necessary) repeats until the program completes or execution fails.
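
The cycle described above can be sketched as a short loop. This is only an illustrative sketch, not the interpreter itself: the names run_online, Action, plan, execute, sense and toy_plan are all hypothetical, a program is assumed to be a plain tuple of actions, and the off-line computation of an optimal policy is replaced by a stub that simply returns the head of the program and its tail as the residual program.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class Action:
    """Hypothetical action record; the flags mirror the two cases in the text."""
    name: str
    stochastic: bool = False  # outcome must be identified by sensing
    sensing: bool = False     # the action itself is a sense action


Program = Tuple[Action, ...]
Situation = Tuple[Tuple[str, Optional[str]], ...]


def toy_plan(program: Program, situation: Situation):
    """Stand-in for the off-line step: NOT an optimal-policy computation.

    Returns (first action, residual program p').
    """
    return program[0], program[1:]


def run_online(program: Program,
               situation: Situation,
               plan: Callable,
               execute: Callable[[Action], None],
               sense: Callable[[Action], str]) -> Situation:
    """Repeat plan / execute / sense until the program is empty."""
    while program:
        action, residual = plan(program, situation)
        if action is None:
            # Assumed failure convention: the planner found nothing executable.
            raise RuntimeError("execution failed: no action available")
        execute(action)  # act in the real world
        observation = None
        if action.stochastic or action.sensing:
            observation = sense(action)  # identify outcome / read sensor data
        # Record what was done and observed, then continue with p'.
        situation = situation + ((action.name, observation),)
        program = residual
    return situation
```

A caller supplies the three callbacks; in a real system plan would invoke the off-line interpreter, execute would drive the actuators, and sense would query the sensors. Keeping them as parameters makes the loop testable without hardware.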