Copyright (c) 2025. All rights reserved. Released under Apache 2.0 license as described in the file LICENSE.
External Theorem 3: PL ⟹ Normal Hessian Lower Bound #
Statement #
For all m ∈ argminSet f and all ξ ∈ ker(Hess f(m))⊥: D²f(m)(ξ,ξ) ≥ μ · ‖ξ‖²
This bridges the PLMB proof (HessianPL.lean) to the PLAcceleratedNesterovLean formulation.
Two formulations #
Our formulation (using
hessian):hessian f m ξ ξ ≥ μ * ‖ξ‖²wherehessian f x = fderiv ℝ (fderiv ℝ f) x(second Fréchet derivative).PLAcceleratedNesterovLean formulation (using
hessianQuadForm):⟨D(∇f)(m)·ξ, ξ⟩ ≥ μ * ‖ξ‖²wheregradient f x = (toDual ℝ E).symm (fderiv ℝ f x)(Riesz representative).
These are equal for C² functions because: fderiv ℝ f = toDual ℝ E ∘ gradient f (Riesz representation) ⟹ hessian f x ξ ξ = ⟨fderiv ℝ (gradient f) x ξ, ξ⟩ (chain rule + toDual_symm_apply)
References #
- Rebjock & Boumal, Corollary 2.17
- PLMB/HessianPL.lean:
hessian_coercive_on_orthogonal_of_MuPL_impl
The global infimum of f. (PLAcceleratedNesterovLean: fStar)
Equations
- PLAcceleratedNesterovLean.ExternalThm3.fStar f = ⨅ (x : E), f x
Instances For
The Polyak–Łojasiewicz condition on a set U.
Uses ‖fderiv ℝ f x‖ which equals ‖gradient f x‖ by Riesz representation.
(PLAcceleratedNesterovLean: PolyakLojasiewicz)
Equations
Instances For
The gradient of f at x, as the Riesz representative of fderiv ℝ f x.
Matches Mathlib's gradient from Analysis.Calculus.Gradient.Basic.
Equations
Instances For
PLAcceleratedNesterovLean's Hessian quadratic form: ⟨D(∇f)(x)·ξ, ξ⟩.
Here gradient f is the Riesz representative of fderiv ℝ f.
Equations
Instances For
Global minimizers are local minimizers.
At a global minimizer, f(m) = f⋆.
PLAcceleratedNesterovLean PL → our MuPL at global minimizers.
Converts ‖Df(x)‖² ≥ 2μ(f(x) - f⋆) to f(x) - f(m) ≤ (2μ)⁻¹‖Df(x)‖².
For C² functions, ⟨D(∇f)(x)ξ, ξ⟩ = D²f(x)(ξ,ξ).
Proof idea: For every z, ⟨∇f(z), v⟩ = Df(z)·v (Riesz representation,
inner_gradient_left). Fixing v = ξ and differentiating both sides at x
in direction ξ gives LHS = RHS. The chain rule steps use:
Core theorem: At a global minimizer under PL, the Hessian is μ-coercive
on ker(Hess)⊥. Uses hessian f m ξ ξ (second Fréchet derivative).
External Theorem 3: PL ⟹ Normal Hessian Lower Bound (PLAcceleratedNesterovLean language).
For C² functions satisfying the Polyak–Łojasiewicz condition with constant μ on an open set U, at any global minimizer m, and for any ξ in the orthogonal complement of ker(Hess f(m)):
⟨D(∇f)(m)·ξ, ξ⟩ ≥ μ · ‖ξ‖²
This is the PLAcceleratedNesterovLean hessianQuadForm version of the
Hessian coercivity bound.
Note on the projection condition: PLAcceleratedNesterovLean uses
fderiv ℝ π m ξ = 0 (where π is the nearest-point projection onto the
minimizer set) to characterize normal vectors.
Under the Morse-Bott property (proved in Theorem.lean), the minimizer set is a C¹
submanifold with tangent space T = ker(Hess f(m)), and the normal space T⊥ equals
ker(fderiv ℝ π m). So fderiv ℝ π m ξ = 0 → ξ ∈ (hessianKer f m).orthogonal.
Quantified version: ∀ m ∈ argminSet f, ∀ ξ ∈ ker(Hess)⊥, bound holds.