Lyapunov Contraction Step 1: Descent Lemma and Parameter Identities #

Descent Lemma #

For L-smooth f with step x₊ = x - (1/L)∇f(x): f(x₊) ≤ f(x) - (1/(2L))‖∇f(x)‖²

Parameter Identities #

With ρ = (1-a)/(1+a) and λ = (1+a)²/(2(1-a)): (1+a)ρ = 1-a λρ² = (1-a)/2

Velocity Coefficient Bound #

If ε·η ≤ a and 0 ≤ a < 1: (1-a)² + ε·η·(1-a) ≤ 1-a (1-a)(1+a) ≤ (1+a)² = 2(1-a)λ

theorem PLAcceleratedNesterovLean.descent_lemma (fx fy grad_sq L : ℝ) (hL : 0 < L) (hsmooth : fy ≤ fx - 1 / L * grad_sq + L / 2 * (1 / L) ^ 2 * grad_sq) :

fy ≤ fx - 1 / (2 * L) * grad_sq

Descent lemma: f(x - η·∇f(x)) ≤ f(x) - (η/2)·‖∇f(x)‖² for η = 1/L. This is stated abstractly: if f(y) ≤ f(x) + ⟨∇f(x), y-x⟩ + (L/2)‖y-x‖² and y = x - (1/L)·∇f(x), then f(y) ≤ f(x) - (1/(2L))·‖∇f(x)‖².

source

theorem PLAcceleratedNesterovLean.rho_identity (a ρ : ℝ) (ha : 1 + a ≠ 0) (hρ : ρ = (1 - a) / (1 + a)) :

(1 + a) * ρ = 1 - a

Parameter identity: (1 + a) · ρ = 1 - a when ρ = (1-a)/(1+a).

source

theorem PLAcceleratedNesterovLean.lambda_rho_sq (a ρ lam : ℝ) (ha_ne : 1 + a ≠ 0) (ha_pos : 1 - a ≠ 0) (hρ : ρ = (1 - a) / (1 + a)) (hlam : lam = (1 + a) ^ 2 / (2 * (1 - a))) :

lam * ρ ^ 2 = (1 - a) / 2

Parameter identity: λ · ρ² = (1-a)/2 when ρ = (1-a)/(1+a) and λ = (1+a)²/(2(1-a)).

source

theorem PLAcceleratedNesterovLean.velocity_normal_coeff (a εη : ℝ) (_ha : 0 ≤ a) (ha1 : a ≤ 1) (_hεη : 0 ≤ εη) (hbound : εη ≤ a) :

(1 - a) ^ 2 + εη * (1 - a) ≤ 1 - a

Velocity normal coefficient bound: (1-a)² + εη(1-a) ≤ (1-a) when εη ≤ a.

source

theorem PLAcceleratedNesterovLean.tangential_coeff (a : ℝ) (ha : 0 ≤ a) :

(1 - a) * (1 + a) ≤ (1 + a) ^ 2

Tangential coefficient bound: (1-a)(1+a) ≤ (1+a)².

source

theorem PLAcceleratedNesterovLean.lsmooth_qub {d : ℕ} (f : E d → ℝ) (L : NNReal) {V : Set (E d)} (hV : IsOpen V) (hf_diff : DifferentiableOn ℝ f V) (hf_lip : LipschitzOnWith L (gradient f) V) (x w : E d) (hx_V : x ∈ V) (hseg_V : ∀ (t : ℝ), 0 ≤ t → t ≤ 1 → x + t • w ∈ V) :

f (x + w) - f x ≤ inner ℝ (gradient f x) w + ↑L / 2 * ‖w‖ ^ 2

Quadratic Upper Bound (QUB) from L-smoothness: for any x, w, f(x + w) - f(x) ≤ ⟨∇f(x), w⟩ + L/2·‖w‖².

source

theorem PLAcceleratedNesterovLean.lsmooth_descent_at {d : ℕ} (f : E d → ℝ) (L : NNReal) (hL : 0 < ↑L) {V : Set (E d)} (hV : IsOpen V) (hf_diff : DifferentiableOn ℝ f V) (hf_lip : LipschitzOnWith L (gradient f) V) (x : E d) (hx_V : x ∈ V) (hseg_V : ∀ (t : ℝ), 0 ≤ t → t ≤ 1 → x + t • -((1 / ↑L) • gradient f x) ∈ V) (η : ℝ) (hη : η = 1 / ↑L) :

f (x - η • gradient f x) ≤ f x - η / 2 * ‖gradient f x‖ ^ 2

L-smooth descent: f(x - η·∇f(x)) ≤ f(x) - (η/2)·‖∇f(x)‖² when η = 1/L and the gradient of f is L-Lipschitz. Combines the quadratic upper bound from L-smoothness with the arithmetic simplification (descent_lemma).

Documentation

LeanPool.PLAcceleratedNesterovLean.Convergence.LyapunovContraction.Step1

Lyapunov Contraction Step 1: Descent Lemma and Parameter Identities #

Descent Lemma #

Parameter Identities #

Velocity Coefficient Bound #