Stochastic gradient descent

Layer 0 — Mathematics, in the numerical-analysis subtree

θ_{k+1} = θ_k − η_k ∇f_i(θ_k), with the index i drawn at random at each step. Converges under the Robbins–Monro step-size conditions (Σ η_k = ∞, Σ η_k² < ∞); common variants add momentum, Nesterov acceleration, or adaptive step sizes (Adam). The workhorse optimizer of machine learning.
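A minimal sketch of the update rule on a one-dimensional least-squares problem, with a decaying step size chosen to satisfy the Robbins–Monro conditions; the problem, step-size schedule, and parameter names here are illustrative, not canonical:

```python
import random

def sgd(data, theta, eta0=0.1, n_steps=10_000):
    """SGD on f(θ) = (1/n) Σ_i (x_i·θ − y_i)², using one sample per step."""
    for k in range(n_steps):
        x, y = random.choice(data)            # pick index i at random
        grad = 2 * (x * theta - y) * x        # ∇f_i(θ_k) for one squared-error term
        eta = eta0 / (1 + k / 100)            # O(1/k) decay: Σ η_k = ∞, Σ η_k² < ∞
        theta -= eta * grad                   # θ_{k+1} = θ_k − η_k ∇f_i(θ_k)
    return theta

random.seed(0)
data = [(x, 3.0 * x) for x in (0.5, 1.0, 1.5, 2.0)]  # exact fit at θ* = 3
print(round(sgd(data, theta=0.0), 4))                # → 3.0
```

Because the data are exactly fit at θ* = 3, every per-sample gradient vanishes at the optimum, so the iterates contract to θ* without a residual noise floor; on noisy data the decaying step size is what damps the gradient noise.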
