Trainability: exponential concentration and barren plateaus #
The unifying notion behind barren plateaus (McClean et al. 2018) and
quantum-kernel concentration (Thanasilp et al. 2022, Def. 1) is exponential
concentration: a quantity indexed by system size n (a loss, a gradient variance,
or a kernel value) deviates from a fixed value μ by at most C / b ^ n for some
b > 1. The practical consequence is that the quantity becomes exponentially
flat: it converges to μ, and resolving its deviation from μ requires a number of samples that grows exponentially in n.
This module gives the definition and its convergence consequence, and records the
barren-plateau models on top of it in the GroverModel/ParamShiftModel style: the
hard Haar / t-design / Weingarten input (the variance bound) is bundled as a
hypothesis, and the trainability consequence is derived.
Sources: McClean, Boixo, Smelyanskiy, Babbush, Neven (2018); Cerezo, Sone, Volkoff, Cincio, Coles (2021); Ragone et al. (2023); Thanasilp, Wang, Cerezo, Holmes (2022).
An exponentially concentrated quantity converges to its concentration value: the landscape becomes exponentially flat.
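As a minimal sketch of the definition described above, the predicate can be written over real sequences as follows. The name `ExpConcentratedSketch` and the quantifier layout are assumptions made for illustration; the module's own `ExpConcentrated` may differ (for instance by also requiring `0 ≤ C`). The example checks one concrete sequence and has not been compiled against the library.

```lean
import Mathlib

/-- Hypothetical reconstruction of the concentration predicate:
`f n` stays within `C / b ^ n` of `μ` for some base `b > 1`. -/
def ExpConcentratedSketch (f : ℕ → ℝ) (μ : ℝ) : Prop :=
  ∃ C b : ℝ, 1 < b ∧ ∀ n : ℕ, |f n - μ| ≤ C / b ^ n

/-- The sequence `μ + 1 / 2 ^ n` concentrates to `μ` with `C = 1` and `b = 2`. -/
example (μ : ℝ) : ExpConcentratedSketch (fun n => μ + 1 / (2 : ℝ) ^ n) μ := by
  unfold ExpConcentratedSketch
  refine ⟨1, 2, by norm_num, fun n => ?_⟩
  -- The deviation from `μ` is exactly `1 / 2 ^ n`, which is nonnegative.
  have h : (0 : ℝ) ≤ 1 / (2 : ℝ) ^ n := by positivity
  have e : μ + 1 / (2 : ℝ) ^ n - μ = 1 / (2 : ℝ) ^ n := by ring
  show |μ + 1 / (2 : ℝ) ^ n - μ| ≤ 1 / (2 : ℝ) ^ n
  rw [e, abs_of_nonneg h]
```

Taking `μ = 0` in this shape gives the barren-plateau condition on a variance sequence.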
A model has a barren plateau when its loss/gradient variance is
exponentially concentrated to 0 (so the trainable signal vanishes with system
size).
Equations
- QuantumAlg.HasBarrenPlateau variance = QuantumAlg.ExpConcentrated variance 0
Under a barren plateau the variance vanishes in the large-system limit.
Lie-algebraic barren plateaus #
Lie-algebraic barren plateaus (Ragone et al. 2023). In the simple-DLA case the
loss variance is P_g(ρ) P_g(O) / dim(g) (their Eq. (10)); bundling the numerator and
the DLA dimension, an exponentially large dynamical Lie algebra forces a barren
plateau.
dim g as a function of the system size.
- numer : ℝ
The g-purity numerator P_g(ρ) P_g(O).
The numerator is nonnegative.
The DLA dimension is positive.
The loss variance.
Ragone et al. (2023), Eq. (10): variance = P_g(ρ) P_g(O) / dim(g).
An exponentially large dynamical Lie algebra forces a barren plateau.
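The step from the variance formula to an exponential bound can be sketched as below. The structure name `LieAlgebraicSketch`, the field names `dimG`, `numer_nonneg`, `dimG_pos`, `variance_eq`, and the hypothesis `hdim` (the DLA dimension dominates `b ^ n`) are all illustrative assumptions, not the module's actual declarations, and the block has not been compiled against the library.

```lean
import Mathlib

/-- Hypothetical bundle mirroring the fields listed above; all names are illustrative. -/
structure LieAlgebraicSketch where
  /-- `dim g` as a function of the system size. -/
  dimG : ℕ → ℝ
  /-- The `g`-purity numerator `P_g(ρ) P_g(O)`. -/
  numer : ℝ
  numer_nonneg : 0 ≤ numer
  dimG_pos : ∀ n, 0 < dimG n
  /-- The loss variance. -/
  variance : ℕ → ℝ
  /-- Ragone et al. (2023), Eq. (10). -/
  variance_eq : ∀ n, variance n = numer / dimG n

namespace LieAlgebraicSketch

/-- The variance is nonnegative. -/
theorem variance_nonneg (M : LieAlgebraicSketch) (n : ℕ) : 0 ≤ M.variance n := by
  rw [M.variance_eq n]
  exact div_nonneg M.numer_nonneg (M.dimG_pos n).le

/-- If `b ^ n ≤ dim g` for some `b > 1`, then `variance n ≤ numer / b ^ n`. -/
theorem variance_le (M : LieAlgebraicSketch) {b : ℝ} (hb : 1 < b)
    (hdim : ∀ n, b ^ n ≤ M.dimG n) (n : ℕ) :
    M.variance n ≤ M.numer / b ^ n := by
  have hb0 : 0 < b := lt_trans zero_lt_one hb
  rw [M.variance_eq n]
  -- A larger denominator gives a smaller quotient when the numerator is nonnegative.
  exact div_le_div_of_nonneg_left M.numer_nonneg (pow_pos hb0 n) (hdim n)

end LieAlgebraicSketch
```

Together the two bounds place the variance within `numer / b ^ n` of `0`, which is the concentration shape with `C = numer` and `μ = 0`.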
Cost-function-dependent barren plateaus #
Cost-function-dependent barren plateaus (Cerezo et al. 2021): a global cost exhibits a barren plateau (exponentially concentrated gradient variance), whereas a local cost is trainable (its gradient variance has a polynomial lower bound).
Gradient variance of the global cost.
Gradient variance of the local cost.
- global_bp : HasBarrenPlateau self.globalVariance
The global cost has a barren plateau.
- local_lb (n : ℕ) : 0 < n → 1 / ↑n ≤ self.localVariance n
The local cost keeps a polynomial lower bound.
The global cost's gradient vanishes (barren plateau).
The local cost's gradient variance stays strictly positive (trainable).
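The positivity of the local variance follows directly from the `local_lb` field. A minimal sketch, assuming a structure shaped like the field list above; the name `CostDependentSketch` is hypothetical, and `global_bp` is written out as an explicit concentration-to-0 bound, which is an assumed unfolding of `HasBarrenPlateau`. The block has not been compiled against the library.

```lean
import Mathlib

/-- Hypothetical bundle mirroring the fields listed above. -/
structure CostDependentSketch where
  /-- Gradient variance of the global cost. -/
  globalVariance : ℕ → ℝ
  /-- Gradient variance of the local cost. -/
  localVariance : ℕ → ℝ
  /-- Barren plateau for the global cost, spelled out as concentration to `0`
  (an assumed shape for `HasBarrenPlateau`). -/
  global_bp : ∃ C b : ℝ, 1 < b ∧ ∀ n : ℕ, |globalVariance n - 0| ≤ C / b ^ n
  /-- Polynomial lower bound for the local cost. -/
  local_lb : ∀ n : ℕ, 0 < n → 1 / (n : ℝ) ≤ localVariance n

/-- For every `n > 0` the local variance is strictly positive,
because it dominates `1 / n > 0`. -/
theorem CostDependentSketch.localVariance_pos (M : CostDependentSketch)
    {n : ℕ} (hn : 0 < n) : 0 < M.localVariance n :=
  lt_of_lt_of_le (one_div_pos.mpr (Nat.cast_pos.mpr hn)) (M.local_lb n hn)
```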
Quantum-kernel concentration #
Quantum-kernel concentration (Thanasilp et al. 2022): the kernel value
concentrates exponentially to a fixed κ₀, so a polynomial number of measurement
shots cannot distinguish inputs (the model becomes input-independent).
This is the abstract deterministic-sequence form. The genuine probabilistic result —
a concrete quantum kernel whose data-averaged value provably concentrates exponentially,
derived from first principles with no Haar assumption — is
LeanPool.LeanQuantumAlg.ryKernel_concentrates in QuantumAlg/Primitives/KernelConcentration.lean,
built on the probabilistic engine LeanPool.LeanQuantumAlg.ExpConcentratedProb.
Equations
- QuantumAlg.KernelConcentration kernel κ₀ = QuantumAlg.ExpConcentrated kernel κ₀
A concentrated kernel converges to its concentration value.
Geometric/equivariant QML trainability #
Geometric/equivariant QML trainability (Ragone et al. 2022 + the DLA variance law). A symmetry-structured model whose dynamical Lie algebra has only polynomial dimension keeps a polynomial lower bound on its gradient variance, hence avoids a barren plateau.
Gradient variance.
- deg : ℕ
Polynomial degree of the lower bound.
The variance is bounded below by 1 / n ^ deg (polynomial trainability).
A geometric/equivariant model with polynomial dynamical Lie algebra has strictly positive (not exponentially vanishing) gradient variance: it is trainable.
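The positivity claim can be sketched from the polynomial lower bound alone. The structure name `GeometricSketch`, the field name `variance_lb`, and the restriction of the bound to `n > 0` are assumptions made for illustration; the module's actual declaration may be stated differently, and the block has not been compiled against the library.

```lean
import Mathlib

/-- Hypothetical bundle mirroring the fields listed above. -/
structure GeometricSketch where
  /-- Gradient variance. -/
  variance : ℕ → ℝ
  /-- Polynomial degree of the lower bound. -/
  deg : ℕ
  /-- Lower bound `1 / n ^ deg`, assumed here to hold for `n > 0`. -/
  variance_lb : ∀ n : ℕ, 0 < n → 1 / (n : ℝ) ^ deg ≤ variance n

/-- For every `n > 0` the variance is strictly positive. -/
theorem GeometricSketch.variance_pos (M : GeometricSketch) {n : ℕ} (hn : 0 < n) :
    0 < M.variance n := by
  have hn' : (0 : ℝ) < (n : ℝ) := Nat.cast_pos.mpr hn
  -- `1 / n ^ deg` is positive and sits below the variance.
  exact lt_of_lt_of_le (one_div_pos.mpr (pow_pos hn' M.deg)) (M.variance_lb n hn)
```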