Lai-Robbins asymptotic regret lower bound R_T >= ln(T) sum Delta_i / KL

Layer 0: Mathematics, in the Stochastic Processes subtree

The Lai-Robbins lower bound (Lai and Robbins 1985, *Adv Appl Math* 6, 4) is the asymptotic information-theoretic floor on the cumulative regret R_T = T mu^* - E[sum_t mu_{A_t}] in stochastic multi-armed bandit problems. For any uniformly-good policy…
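As a rough illustration of the bound named in the title, R_T >= ln(T) sum Delta_i / KL, the sketch below evaluates the leading-order constant for Bernoulli arms. The Bernoulli reward model, the function names (`bernoulli_kl`, `lai_robbins_constant`, `asymptotic_regret_floor`), and the example means are assumptions made for illustration and do not come from the source. Delta_i is taken to be mu^* - mu_i, and KL is the divergence from arm i's reward distribution to the optimal arm's. Because the bound is asymptotic, the returned value only indicates the leading-order growth for large T; it is not a guarantee at any finite horizon.

```python
import math


def bernoulli_kl(p, q):
    """KL(Ber(p) || Ber(q)) in nats. Assumes 0 <= p <= 1 and 0 < q < 1."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")

    def term(a, b):
        # Convention: 0 * log(0 / b) = 0.
        return 0.0 if a == 0.0 else a * math.log(a / b)

    return term(p, q) + term(1.0 - p, 1.0 - q)


def lai_robbins_constant(means):
    """Sum over suboptimal arms of Delta_i / KL(mu_i, mu^*).

    Assumption for this sketch: every arm is Bernoulli with the given mean.
    """
    mu_star = max(means)
    total = 0.0
    for mu in means:
        gap = mu_star - mu  # Delta_i
        if gap > 0.0:  # optimal arms contribute nothing
            total += gap / bernoulli_kl(mu, mu_star)
    return total


def asymptotic_regret_floor(means, horizon):
    """Leading-order term ln(T) * sum_i Delta_i / KL(mu_i, mu^*)."""
    return lai_robbins_constant(means) * math.log(horizon)


if __name__ == "__main__":
    example_means = [0.5, 0.4, 0.3]  # hypothetical arm means
    for T in (10**3, 10**4, 10**5):
        print(T, asymptotic_regret_floor(example_means, T))
```

Arms whose means are close to mu^* have a small KL divergence relative to their gap, so they dominate the sum: under this model, hard-to-distinguish arms are what make the regret floor large.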

Related concepts

Kullback-Leibler divergence
