Documentation

LeanPool.PLAcceleratedNesterovLean.MorseBott.NormalHessianBound

Copyright (c) 2025. All rights reserved. Released under Apache 2.0 license as described in the file LICENSE.

External Theorem 3: PL ⟹ Normal Hessian Lower Bound #

Statement #

For all m ∈ argminSet f and all ξ ∈ ker(Hess f(m))⊥: D²f(m)(ξ,ξ) ≥ μ · ‖ξ‖²

This bridges the PLMB proof (HessianPL.lean) to the PLAcceleratedNesterovLean formulation.

Two formulations #

  1. Our formulation (using hessian): hessian f m ξ ξ ≥ μ * ‖ξ‖² where hessian f x = fderiv ℝ (fderiv ℝ f) x (second Fréchet derivative).

  2. PLAcceleratedNesterovLean formulation (using hessianQuadForm): ⟨D(∇f)(m)·ξ, ξ⟩ ≥ μ * ‖ξ‖² where gradient f x = (toDual ℝ E).symm (fderiv ℝ f x) (Riesz representative).

These are equal for C² functions because: fderiv ℝ f = toDual ℝ E ∘ gradient f (Riesz representation) ⟹ hessian f x ξ ξ = ⟨fderiv ℝ (gradient f) x ξ, ξ⟩ (chain rule + toDual_symm_apply)

References #

The set of global minimizers of f. (PLAcceleratedNesterovLean: argminSet)

Equations
Instances For
    noncomputable def PLAcceleratedNesterovLean.ExternalThm3.fStar {E : Type u_1} (f : E) :

    The global infimum of f. (PLAcceleratedNesterovLean: fStar)

    Equations
    Instances For

      The Polyak–Łojasiewicz condition on a set U. Uses ‖fderiv ℝ f x‖ which equals ‖gradient f x‖ by Riesz representation. (PLAcceleratedNesterovLean: PolyakLojasiewicz)

      Equations
      Instances For

        The gradient of f at x, as the Riesz representative of fderiv ℝ f x. Matches Mathlib's gradient from Analysis.Calculus.Gradient.Basic.

        Equations
        Instances For

          PLAcceleratedNesterovLean's Hessian quadratic form: ⟨D(∇f)(x)·ξ, ξ⟩. Here gradient f is the Riesz representative of fderiv ℝ f.

          Equations
          Instances For

            Global minimizers are local minimizers.

            theorem PLAcceleratedNesterovLean.ExternalThm3.fStar_eq_of_argmin {E : Type u_1} [NormedAddCommGroup E] (f : E) (m : E) (hm : m argminSet f) :
            fStar f = f m

            At a global minimizer, f(m) = f⋆.

            theorem PLAcceleratedNesterovLean.ExternalThm3.pl_to_muPL {E : Type u_1} [NormedAddCommGroup E] [InnerProductSpace E] (f : E) (μ : ) (U : Set E) (m : E) (hm : m argminSet f) (hU : U nhds m) (hPL : PolyakLojasiewicz f μ U) :
            MuPL f μ m

            PLAcceleratedNesterovLean PL → our MuPL at global minimizers. Converts ‖Df(x)‖² ≥ 2μ(f(x) - f⋆) to f(x) - f(m) ≤ (2μ)⁻¹‖Df(x)‖².

            For C² functions, ⟨D(∇f)(x)ξ, ξ⟩ = D²f(x)(ξ,ξ).

            Proof idea: For every z, ⟨∇f(z), v⟩ = Df(z)·v (Riesz representation, inner_gradient_left). Fixing v = ξ and differentiating both sides at x in direction ξ gives LHS = RHS. The chain rule steps use:

            • innerSL ℝ ξ : E →L[ℝ] ℝ (genuine CLM, no star) composed with gradient f
            • Evaluation at ξ composed with fderiv ℝ f Both are standard CLM chain rules.
            theorem PLAcceleratedNesterovLean.ExternalThm3.hessian_coercive_globalMin_PL {E : Type u_1} [NormedAddCommGroup E] [InnerProductSpace E] [FiniteDimensional E] (f : E) (μ : ) (U : Set E) (m ξ : E) (hf : ContDiffAt 2 f m) (hPL : PolyakLojasiewicz f μ U) (hm : m argminSet f) (hU : U nhds m) ( : ξ (hessianKer f m)) :
            ((hessian f m) ξ) ξ μ * ξ ^ 2

            Core theorem: At a global minimizer under PL, the Hessian is μ-coercive on ker(Hess)⊥. Uses hessian f m ξ ξ (second Fréchet derivative).

            theorem PLAcceleratedNesterovLean.ExternalThm3.hessianQuadForm_bound {E : Type u_1} [NormedAddCommGroup E] [InnerProductSpace E] [FiniteDimensional E] (f : E) (μ : ) (U : Set E) (m ξ : E) (hf : ContDiffAt 2 f m) (hPL : PolyakLojasiewicz f μ U) (hm : m argminSet f) (hU : U nhds m) ( : ξ (hessianKer f m)) :
            hessianQuadForm f m ξ μ * ξ ^ 2

            External Theorem 3: PL ⟹ Normal Hessian Lower Bound (PLAcceleratedNesterovLean language).

            For C² functions satisfying the Polyak–Łojasiewicz condition with constant μ on an open set U, at any global minimizer m, and for any ξ in the orthogonal complement of ker(Hess f(m)):

            ⟨D(∇f)(m)·ξ, ξ⟩ ≥ μ · ‖ξ‖²

            This is the PLAcceleratedNesterovLean hessianQuadForm version of the Hessian coercivity bound.

            Note on the projection condition: PLAcceleratedNesterovLean uses fderiv ℝ π m ξ = 0 (where π is the nearest-point projection onto the minimizer set) to characterize normal vectors. Under the Morse-Bott property (proved in Theorem.lean), the minimizer set is a C¹ submanifold with tangent space T = ker(Hess f(m)), and the normal space T⊥ equals ker(fderiv ℝ π m). So fderiv ℝ π m ξ = 0 → ξ ∈ (hessianKer f m).orthogonal.

            theorem PLAcceleratedNesterovLean.ExternalThm3.hessianQuadForm_bound_forall {E : Type u_1} [NormedAddCommGroup E] [InnerProductSpace E] [FiniteDimensional E] (f : E) (μ : ) (U : Set E) (hf : ContDiff 2 f) (hPL : PolyakLojasiewicz f μ U) (hU_open : IsOpen U) (hS_sub : argminSet f U) (m : E) :
            m argminSet fξ(hessianKer f m), hessianQuadForm f m ξ μ * ξ ^ 2

            Quantified version: ∀ m ∈ argminSet f, ∀ ξ ∈ ker(Hess)⊥, bound holds.