Documentation

LeanPool.PLAcceleratedNesterovLean.Convergence.CurvAbsorb.Algebraic

Curvature Absorption Helper Lemmas #

Helper lemmas for the curvature absorption proof (h_curv_absorb_hyp in Basic.lean). These establish the key perturbation bounds:

  1. Step bound: ‖hn‖ ≤ η·‖gn‖ + √η·|ρ|·(‖vn‖ + √η·‖gn‖)
  2. MVT bound: ‖P(y-x) - (π y - π x)‖ ≤ ε₁·‖y-x‖
  3. Kills-normal bound: ‖P en‖ ≤ ε₁·‖en‖
theorem PLAcceleratedNesterovLean.nesterov_step_bound {d : } (f : E d) (η ρ : ) (s : NesterovState d) ( : 0 η) :

The Nesterov step satisfies hn = -η·gn + √η·ρ·(vn - √η·gn), so ‖hn‖ ≤ η·‖gn‖ + √η·|ρ|·(‖vn‖ + √η·‖gn‖).

theorem PLAcceleratedNesterovLean.xi_bound_mvt {d : } (P : E d →L[] E d) (π : E dE d) (x y c : E d) (r : ) (hxs : x Metric.ball c r) (hys : y Metric.ball c r) (hπ_diff : zMetric.ball c r, DifferentiableAt π z) (ε₁ : ) (hbound : zMetric.ball c r, fderiv π z - P ε₁) :
P (y - x) - (π y - π x) ε₁ * y - x

MVT bound: if ‖fderiv π z - P‖ ≤ ε₁ for all z in a ball, then ‖P(y-x) - (π y - π x)‖ ≤ ε₁·‖y-x‖ for x, y in the ball.

theorem PLAcceleratedNesterovLean.proj_normal_bound {d : } (P : E d →L[] E d) (π : E dE d) (x : E d) (hkills : (fderiv π (π x)) (x - π x) = 0) (ε₁ : ) (hDπ_close : fderiv π (π x) - P ε₁) :
P (x - π x) ε₁ * x - π x

If Dπ(π x) kills (x - π x) and ‖Dπ(π x) - P‖ ≤ ε₁, then ‖P(x - π x)‖ ≤ ε₁·‖x - π x‖.

theorem PLAcceleratedNesterovLean.curv_absorption_algebraic {E' : Type u_1} [NormedAddCommGroup E'] [InnerProductSpace E'] (wn ξn gn Pen hn en : E') (sm sa Ln : ) (hsm_nn : 0 sm) (hsa_pos : 0 < sa) (hLn_nn : 0 Ln) (ε₁ : ) (hε₁_pos : 0 < ε₁) (hε₁_le1 : ε₁ 1) ( : ξn ε₁ * hn) (hPe : Pen ε₁ * en) (C_wh : ) (_hC_wh_nn : 0 C_wh) (hwh : sm ^ 2 * wn ^ 2 + hn ^ 2 C_wh * Ln) (C_h : ) (_hC_h_nn : 0 C_h) (hh_sq : hn ^ 2 C_h * Ln) (C_ge : ) (_hC_ge_nn : 0 C_ge) (hge : gn * en C_ge * Ln) (θ : ) (_hθ_pos : 0 < θ) (habs : ε₁ * (C_wh / 2 + sm ^ 2 / 2 * C_h + sa * C_ge) θ * sa) :
sm * inner wn ξn + sm ^ 2 / 2 * ξn ^ 2 + sa * |inner gn Pen| θ * sa * Ln

Core algebraic assembly for curvature absorption. Given that ‖ξ‖ and ‖Pe‖ each have an ε₁ factor from Dπ continuity, and all squared norms are bounded by constants times Ln, conclude δ_curv + proj_err ≤ a/2 * Ln.

The key algebraic steps are:

  • Young's inequality: sm·‖w‖·‖h‖ ≤ (sm²·‖w‖² + ‖h‖²)/2 (absorbs √μ')
  • ε₁² ≤ ε₁ (since ε₁ ≤ 1), so ‖ξ‖² ≤ ε₁·‖h‖²
  • Everything factors as ε₁ · K · Ln, and ε₁ · K ≤ a/2 by hypothesis