VALUE ITERATION

Value iteration - This is an algorithm for infinite-horizon [stochastic] dynamic programs that proceeds by successive approximation to satisfy the fundamental equation:

F(s) = Opt(r(x, s) + a Sum_s'(P(x, s, s')F(s'))),
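where a is a discount factor, Opt (Max or Min) is taken over the decisions x, r(x, s) is the immediate return from decision x in state s, and P(x, s, s') is the probability of moving from state s to state s' under decision x. The successive approximation becomes the DP forward equation. If 0 < a < 1, a solution F is a fixed point of the operator on the right-hand side, and Banach's theorem yields convergence because that operator is then a contraction map. Even when there is no discounting, policy iteration can apply.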
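
The successive approximation can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function name value_iteration, the arguments tol and max_iter, the stopping rule, and the two-state example are all hypothetical and chosen only to mirror the symbols F, r, P, a and Opt in the equation above.

```python
def value_iteration(states, actions, r, P, a, opt=max, tol=1e-10, max_iter=10_000):
    """Approximate F(s) = Opt_x( r(x, s) + a * Sum_s' P(x, s, s') F(s') ).

    r(x, s) and P(x, s, t) are plain callables; opt is max or min.
    The stopping rule (sup-norm change below tol) is an assumption.
    """
    F = {s: 0.0 for s in states}  # arbitrary starting guess
    for _ in range(max_iter):
        # Apply the right-hand side of the fundamental equation once.
        F_new = {
            s: opt(
                r(x, s) + a * sum(P(x, s, t) * F[t] for t in states)
                for x in actions
            )
            for s in states
        }
        gap = max(abs(F_new[s] - F[s]) for s in states)
        F = F_new
        if gap < tol:
            break
    return F


# Hypothetical two-state example: state 1 pays 1 per period, state 0 pays 0.
states = [0, 1]
actions = ["stay", "move"]


def r(x, s):
    return 1.0 if s == 1 else 0.0


def P(x, s, t):
    if x == "stay":
        return 1.0 if t == s else 0.0
    # "move" switches state with probability 0.8, otherwise stays put
    return 0.8 if t != s else 0.2


F = value_iteration(states, actions, r, P, a=0.5)
print(F)
```

With a = 0.5 and Opt = max, the fixed point of this example can be checked by hand: F(1) = 1 + 0.5 F(1) gives F(1) = 2, and F(0) = 0.5 (0.8 F(1) + 0.2 F(0)) gives F(0) = 8/9. The iterates approach these values geometrically, as the contraction argument predicts.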