LeanPool.RlTheoryInLean

RL Theory in Lean

Source: arXiv:2511.03618
Authors: Shangtong Zhang
Status: verified
Main declarations: StochasticMatrix.stationary_distribution_exists
Tags: probability, reinforcement-learning, stochastic-matrices
MSC: 62L20, 60J10

Provenance

Imported from https://github.com/ShangtongZhang/rl-theory-in-lean (MIT-licensed upstream; relicensed into Lean Pool under Apache 2.0 with the upstream author's copyright preserved). Accompanies the paper Towards Formalizing Reinforcement Learning Theory (arXiv:2511.03618). Ported from Lean v4.28.0-rc1 to Lean Pool's v4.30.0-rc2. This Lean Pool import keeps the warning-clean core infrastructure.

Mathematical overview

The project mirrors Mathlib's directory layout. It develops row-stochastic matrices, Doeblin minorization and geometric mixing for finite chains, finite Markov-chain kernels and trajectory measures, measure/kernel helper lemmas, and the discrete Gronwall inequalities used in stochastic approximation.

def Aperiodic {S : Type u} [Fintype S] [DecidableEq S] (P : Matrix S S ℝ) [StochasticMatrix.RowStochastic P] :

Alias of StochasticMatrix.Aperiodic.

A stochastic matrix is aperiodic if, for each state, the GCD of its return times is 1.

Equations

@Aperiodic = @StochasticMatrix.Aperiodic

def DoeblinMinorization {S : Type u} [Fintype S] (P : Matrix S S ℝ) [StochasticMatrix.RowStochastic P] :

Alias of StochasticMatrix.DoeblinMinorization.

A Doeblin minorization for a finite stochastic matrix.

Equations

@DoeblinMinorization = @StochasticMatrix.DoeblinMinorization

def GeometricMixing {S : Type u} [Fintype S] [DecidableEq S] (P : Matrix S S ℝ) [StochasticMatrix.RowStochastic P] :

Alias of StochasticMatrix.GeometricMixing.

Geometric convergence to stationarity in total variation/L¹ distance.

Equations

@GeometricMixing = @StochasticMatrix.GeometricMixing

def RowStochastic {S : Type u} [Fintype S] (P : Matrix S S ℝ) :

Alias of StochasticMatrix.RowStochastic.

A matrix whose rows are stochastic vectors.

Equations

@RowStochastic = @StochasticMatrix.RowStochastic

def Stationary {S : Type u} [Fintype S] (μ : S → ℝ) (P : Matrix S S ℝ) :

Alias of StochasticMatrix.Stationary.

A stationary distribution for a matrix.

Equations

@Stationary = @StochasticMatrix.Stationary

def cesaroAverage {S : Type u} [Fintype S] [DecidableEq S] (x₀ : S → ℝ) (P : Matrix S S ℝ) (n : ℕ) :

S → ℝ

Alias of StochasticMatrix.cesaroAverage.

The Cesàro average of the first n + 1 iterates of a stochastic vector.

Equations

@cesaroAverage = @StochasticMatrix.cesaroAverage

theorem cesaro_average_almost_invariant {S : Type u} [Fintype S] [DecidableEq S] (x₀ : S → ℝ) [StochasticMatrix.StochasticVec x₀] (P : Matrix S S ℝ) [StochasticMatrix.RowStochastic P] (n : ℕ) :

‖WithLp.toLp 1 (Matrix.vecMul (StochasticMatrix.cesaroAverage x₀ P n) P - StochasticMatrix.cesaroAverage x₀ P n)‖ ≤ 2 / (↑n + 1)

Alias of StochasticMatrix.cesaro_average_almost_invariant.

theorem cesaro_average_is_svec {S : Type u} [Fintype S] [DecidableEq S] (x₀ : S → ℝ) [StochasticMatrix.StochasticVec x₀] (P : Matrix S S ℝ) [StochasticMatrix.RowStochastic P] (n : ℕ) :

StochasticMatrix.StochasticVec (StochasticMatrix.cesaroAverage x₀ P n)

Alias of StochasticMatrix.cesaro_average_is_svec.

theorem chapman_kolmogorov_eq_ge {S : Type u} [Fintype S] [DecidableEq S] (P : Matrix S S ℝ) [StochasticMatrix.RowStochastic P] (m n : ℕ) (i j k : S) :

(P ^ (m + n)) i j ≥ (P ^ m) i k * (P ^ n) k j

Alias of StochasticMatrix.chapman_kolmogorov_eq_ge.
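
The inequality keeps a single term of the Chapman-Kolmogorov sum, which is already enough to propagate positivity along a path through the intermediate state k. The following is a minimal sketch, assuming that importing LeanPool.RlTheoryInLean brings the StochasticMatrix declarations and the required Mathlib notation into scope; it has not been checked against the pinned toolchain.

```lean
import LeanPool.RlTheoryInLean

-- Sketch: if i reaches k in m steps and k reaches j in n steps with
-- positive probability, then i reaches j in m + n steps with positive
-- probability. The hypotheses h₁ and h₂ are illustrative.
example {S : Type} [Fintype S] [DecidableEq S] (P : Matrix S S ℝ)
    [StochasticMatrix.RowStochastic P] (m n : ℕ) (i j k : S)
    (h₁ : 0 < (P ^ m) i k) (h₂ : 0 < (P ^ n) k j) :
    0 < (P ^ (m + n)) i j :=
  -- The product of two positive entries is positive, and the lemma
  -- bounds the (m + n)-step entry below by that product.
  lt_of_lt_of_le (mul_pos h₁ h₂)
    (StochasticMatrix.chapman_kolmogorov_eq_ge P m n i j k)
```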

theorem eventually_positive {S : Type u} [Fintype S] [DecidableEq S] [Nonempty S] (P : Matrix S S ℝ) [StochasticMatrix.RowStochastic P] [StochasticMatrix.Irreducible P] [StochasticMatrix.Aperiodic P] :

∃ (N : ℕ), ∀ (n : ℕ) (i j : S), N ≤ n → 0 < (P ^ n) i j

Alias of StochasticMatrix.eventually_positive.
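
As a usage illustration, the threshold N can be instantiated at n = N to obtain one concrete step at which a given state returns to itself with positive probability. This is a sketch under the same import assumption as the module name shown at the top of this page, not a tested proof.

```lean
import LeanPool.RlTheoryInLean

-- Sketch: specialise the uniform threshold N to a single diagonal entry.
example {S : Type} [Fintype S] [DecidableEq S] [Nonempty S]
    (P : Matrix S S ℝ) [StochasticMatrix.RowStochastic P]
    [StochasticMatrix.Irreducible P] [StochasticMatrix.Aperiodic P]
    (i : S) :
    ∃ n : ℕ, 0 < (P ^ n) i i := by
  obtain ⟨N, hN⟩ := StochasticMatrix.eventually_positive P
  -- hN : ∀ n i j, N ≤ n → 0 < (P ^ n) i j; take n = N and j = i.
  exact ⟨N, hN N i i le_rfl⟩
```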

theorem multi_step_stationary {S : Type u} [Fintype S] [DecidableEq S] (μ : S → ℝ) [StochasticMatrix.StochasticVec μ] (P : Matrix S S ℝ) [StochasticMatrix.RowStochastic P] (n : ℕ) [StochasticMatrix.Stationary μ P] :

StochasticMatrix.Stationary μ (P ^ n)

Alias of StochasticMatrix.multi_step_stationary.

theorem pos_of_stationary {S : Type u} [Fintype S] [DecidableEq S] (μ : S → ℝ) [StochasticMatrix.StochasticVec μ] (P : Matrix S S ℝ) [StochasticMatrix.RowStochastic P] [StochasticMatrix.Irreducible P] [StochasticMatrix.Stationary μ P] (s : S) :

0 < μ s

Alias of StochasticMatrix.pos_of_stationary.
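
Combined with stationary_distribution_exists, this lemma yields a strictly positive stationary distribution for any irreducible chain. The sketch below assumes that StochasticVec and Stationary can be registered as local instances with haveI, which their use in instance binders on this page suggests; it is illustrative and untested.

```lean
import LeanPool.RlTheoryInLean

-- Sketch: an irreducible row-stochastic matrix has a stationary
-- distribution whose entries are all strictly positive.
example {S : Type} [Fintype S] [DecidableEq S] [Nonempty S]
    (P : Matrix S S ℝ) [StochasticMatrix.RowStochastic P]
    [StochasticMatrix.Irreducible P] :
    ∃ μ : S → ℝ, StochasticMatrix.Stationary μ P ∧ ∀ s, 0 < μ s := by
  obtain ⟨μ, hvec, hstat⟩ :=
    StochasticMatrix.stationary_distribution_exists P
  -- Make both facts available to instance resolution.
  haveI : StochasticMatrix.StochasticVec μ := hvec
  haveI : StochasticMatrix.Stationary μ P := hstat
  exact ⟨μ, hstat, fun s => StochasticMatrix.pos_of_stationary μ P s⟩
```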

def returnTimes {S : Type u} [Fintype S] [DecidableEq S] (P : Matrix S S ℝ) (i : S) :

Alias of StochasticMatrix.returnTimes.

The set of positive return times for state i.

Equations

@returnTimes = @StochasticMatrix.returnTimes

theorem return_times_add_mem {S : Type u} [Fintype S] [DecidableEq S] (P : Matrix S S ℝ) [StochasticMatrix.RowStochastic P] (i : S) {a b : ℕ} (ha : a ∈ StochasticMatrix.returnTimes P i) (hb : b ∈ StochasticMatrix.returnTimes P i) :

a + b ∈ StochasticMatrix.returnTimes P i

Alias of StochasticMatrix.return_times_add_mem.

Return times are closed under addition (used via AddSubmonoid.closure).
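
A small illustration of the closure property: doubling a return time gives another return time. The sketch assumes only the signature shown above and has not been compiled against the pinned toolchain.

```lean
import LeanPool.RlTheoryInLean

-- Sketch: apply closure under addition with both summands equal to a.
example {S : Type} [Fintype S] [DecidableEq S] (P : Matrix S S ℝ)
    [StochasticMatrix.RowStochastic P] (i : S) {a : ℕ}
    (ha : a ∈ StochasticMatrix.returnTimes P i) :
    a + a ∈ StochasticMatrix.returnTimes P i :=
  StochasticMatrix.return_times_add_mem P i ha ha
```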

def smatAsOperator {S : Type u} [Fintype S] (P : Matrix S S ℝ) [StochasticMatrix.RowStochastic P] :

↑(StochasticMatrix.Simplex S) → ↑(StochasticMatrix.Simplex S)

Alias of StochasticMatrix.smatAsOperator.

The affine action of a stochastic matrix on the probability simplex.

Equations

@smatAsOperator = @StochasticMatrix.smatAsOperator

theorem smat_as_operator_iter {S : Type u} [Fintype S] [DecidableEq S] (P : Matrix S S ℝ) [StochasticMatrix.RowStochastic P] (n : ℕ) :

(StochasticMatrix.smatAsOperator P)^[n] = fun (μ : ↑(StochasticMatrix.Simplex S)) => ⟨WithLp.toLp 1 (Matrix.vecMul (↑μ).ofLp (P ^ n)), ⋯⟩

Alias of StochasticMatrix.smat_as_operator_iter.

theorem smat_contraction_in_simplex {S : Type u} [Fintype S] (P : Matrix S S ℝ) [StochasticMatrix.RowStochastic P] [StochasticMatrix.DoeblinMinorization P] :

∃ (K : NNReal), 0 < K ∧ ContractingWith K (StochasticMatrix.smatAsOperator P)

Alias of StochasticMatrix.smat_contraction_in_simplex.
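
In Mathlib, ContractingWith K f unfolds to the conjunction of K < 1 and LipschitzWith K f, so the result can be unpacked into an explicit Lipschitz bound on the simplex. The sketch below assumes the metric structure on the simplex is the one the library supplies for this statement; it is illustrative rather than verified.

```lean
import LeanPool.RlTheoryInLean

-- Sketch: expose the two components of the contraction.
example {S : Type} [Fintype S] (P : Matrix S S ℝ)
    [StochasticMatrix.RowStochastic P]
    [StochasticMatrix.DoeblinMinorization P] :
    ∃ K : NNReal, K < 1 ∧
      LipschitzWith K (StochasticMatrix.smatAsOperator P) := by
  -- Discard the positivity of K; keep the contraction hypothesis.
  obtain ⟨K, -, hK⟩ := StochasticMatrix.smat_contraction_in_simplex P
  -- hK.1 : K < 1, hK.2 : LipschitzWith K (smatAsOperator P)
  exact ⟨K, hK.1, hK.2⟩
```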

theorem smat_minorizable_with_large_pow {S : Type u} [Fintype S] [DecidableEq S] [Nonempty S] (P : Matrix S S ℝ) [StochasticMatrix.RowStochastic P] [StochasticMatrix.Irreducible P] [StochasticMatrix.Aperiodic P] :

∃ (N : ℕ), 1 ≤ N ∧ StochasticMatrix.DoeblinMinorization (P ^ N)

Alias of StochasticMatrix.smat_minorizable_with_large_pow.

theorem smat_mul_smat_is_smat {S : Type u} [Fintype S] (P Q : Matrix S S ℝ) [StochasticMatrix.RowStochastic P] [StochasticMatrix.RowStochastic Q] :

StochasticMatrix.RowStochastic (P * Q)

Alias of StochasticMatrix.smat_mul_smat_is_smat.

theorem smat_nonexpansive_in_l1 {S : Type u} [Fintype S] (Q : Matrix S S ℝ) [StochasticMatrix.RowStochastic Q] (x y : S → ℝ) :

‖WithLp.toLp 1 (Matrix.vecMul x Q - Matrix.vecMul y Q)‖₊ ≤ ‖WithLp.toLp 1 (x - y)‖₊

Alias of StochasticMatrix.smat_nonexpansive_in_l1.

theorem smat_pow_is_smat {S : Type u} [Fintype S] [DecidableEq S] (P : Matrix S S ℝ) [StochasticMatrix.RowStochastic P] (n : ℕ) :

StochasticMatrix.RowStochastic (P ^ n)

Alias of StochasticMatrix.smat_pow_is_smat.
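
The closure lemmas compose: a power of one row-stochastic matrix multiplied by another row-stochastic matrix is again row-stochastic. A minimal sketch, assuming RowStochastic can be supplied as a local instance through haveI; the choice of P ^ n * Q is only an example.

```lean
import LeanPool.RlTheoryInLean

-- Sketch: chain closure under powers with closure under products.
example {S : Type} [Fintype S] [DecidableEq S] (P Q : Matrix S S ℝ)
    [StochasticMatrix.RowStochastic P]
    [StochasticMatrix.RowStochastic Q] (n : ℕ) :
    StochasticMatrix.RowStochastic (P ^ n * Q) := by
  -- First record that P ^ n is row-stochastic ...
  haveI : StochasticMatrix.RowStochastic (P ^ n) :=
    StochasticMatrix.smat_pow_is_smat P n
  -- ... then apply closure under multiplication.
  exact StochasticMatrix.smat_mul_smat_is_smat (P ^ n) Q
```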

theorem smat_pow_nonexpansive_in_l1 {S : Type u} [Fintype S] [DecidableEq S] (Q : Matrix S S ℝ) [StochasticMatrix.RowStochastic Q] (n : ℕ) (x y : S → ℝ) :

‖WithLp.toLp 1 (Matrix.vecMul x (Q ^ n) - Matrix.vecMul y (Q ^ n))‖₊ ≤ ‖WithLp.toLp 1 (x - y)‖₊

Alias of StochasticMatrix.smat_pow_nonexpansive_in_l1.

theorem stationary_distribution_exists {S : Type u} [Fintype S] [Nonempty S] (P : Matrix S S ℝ) [StochasticMatrix.RowStochastic P] :

∃ (μ : S → ℝ), StochasticMatrix.StochasticVec μ ∧ StochasticMatrix.Stationary μ P

Alias of StochasticMatrix.stationary_distribution_exists.

theorem stationary_distribution_uniquely_exists {S : Type u} [Fintype S] [DecidableEq S] [Nonempty S] (P : Matrix S S ℝ) [StochasticMatrix.RowStochastic P] [StochasticMatrix.Aperiodic P] [StochasticMatrix.Irreducible P] :

∃! μ : S → ℝ, StochasticMatrix.StochasticVec μ ∧ StochasticMatrix.Stationary μ P

Alias of StochasticMatrix.stationary_distribution_uniquely_exists.
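
The uniqueness half is typically consumed through Mathlib's ExistsUnique.unique: any two stochastic vectors that are stationary for P must coincide. The hypotheses hμ and hν below are illustrative, and the sketch has not been compiled against the pinned toolchain.

```lean
import LeanPool.RlTheoryInLean

-- Sketch: two stationary distributions of an irreducible, aperiodic
-- chain are equal.
example {S : Type} [Fintype S] [DecidableEq S] [Nonempty S]
    (P : Matrix S S ℝ) [StochasticMatrix.RowStochastic P]
    [StochasticMatrix.Aperiodic P] [StochasticMatrix.Irreducible P]
    (μ ν : S → ℝ)
    (hμ : StochasticMatrix.StochasticVec μ ∧ StochasticMatrix.Stationary μ P)
    (hν : StochasticMatrix.StochasticVec ν ∧ StochasticMatrix.Stationary ν P) :
    μ = ν :=
  (StochasticMatrix.stationary_distribution_uniquely_exists P).unique hμ hν
```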

theorem sum_svec_mul_smat_eq_one {S : Type u} [Fintype S] (μ : S → ℝ) [StochasticMatrix.StochasticVec μ] (P : Matrix S S ℝ) [StochasticMatrix.RowStochastic P] :

∑ i : S, ∑ j : S, μ i * P i j = 1

Alias of StochasticMatrix.sum_svec_mul_smat_eq_one.

theorem svec_mul_smat_is_svec {S : Type u} [Fintype S] (μ : S → ℝ) [StochasticMatrix.StochasticVec μ] (P : Matrix S S ℝ) [StochasticMatrix.RowStochastic P] :

StochasticMatrix.StochasticVec (Matrix.vecMul μ P)

Alias of StochasticMatrix.svec_mul_smat_is_svec.
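
Applying the lemma twice shows that a distribution stays in the simplex after two consecutive transitions, possibly under different matrices. A minimal sketch, assuming StochasticVec can be supplied as a local instance through haveI:

```lean
import LeanPool.RlTheoryInLean

-- Sketch: push a stochastic vector through P and then through Q.
example {S : Type} [Fintype S] (μ : S → ℝ)
    [StochasticMatrix.StochasticVec μ] (P Q : Matrix S S ℝ)
    [StochasticMatrix.RowStochastic P]
    [StochasticMatrix.RowStochastic Q] :
    StochasticMatrix.StochasticVec
      (Matrix.vecMul (Matrix.vecMul μ P) Q) := by
  -- One step under P keeps the vector stochastic ...
  haveI : StochasticMatrix.StochasticVec (Matrix.vecMul μ P) :=
    StochasticMatrix.svec_mul_smat_is_svec μ P
  -- ... and so does the following step under Q.
  exact StochasticMatrix.svec_mul_smat_is_svec (Matrix.vecMul μ P) Q
```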

def uniformDistribution {S : Type u} [Fintype S] :

S → ℝ

Alias of StochasticMatrix.uniformDistribution.

The uniform probability distribution on a nonempty finite type.

Equations

@uniformDistribution = @StochasticMatrix.uniformDistribution

theorem StochasticVec.le_one {S : Type u} [Fintype S] (x : S → ℝ) [StochasticMatrix.StochasticVec x] (s : S) :

x s ≤ 1

Alias of StochasticMatrix.StochasticVec.le_one.

theorem ContinuousLinearMap.condExp_comp {Ω : Type u_1} {α : Type u_2} {β : Type u_3} [MeasurableSpace α] [NormedAddCommGroup α] [NormedSpace ℝ α] [CompleteSpace α] [BorelSpace α] [NormedAddCommGroup β] [NormedSpace ℝ β] [CompleteSpace β] [MeasurableSpace β] [SecondCountableTopology β] [BorelSpace β] {m m₀ : MeasurableSpace Ω} {μ : MeasureTheory.Measure Ω} (hm : m ≤ m₀) [MeasureTheory.SigmaFinite (μ.trim hm)] {f : Ω → α} (hf : MeasureTheory.Integrable f μ) (L : α →L[ℝ] β) :

μ[⇑L ∘ f | m] =ᵐ[μ] ⇑L ∘ μ[f | m]

Alias of MeasureTheory.ContinuousLinearMap.condExp_comp.

theorem Integrable.finset_sum {α : Type u_2} {ι : Type u_3} {m : MeasurableSpace α} {μ : MeasureTheory.Measure α} [MeasureTheory.IsFiniteMeasure μ] {s : Finset ι} {f : ι → α → ℝ} (hf : ∀ i ∈ s, MeasureTheory.Integrable (f i) μ) :

MeasureTheory.Integrable (fun (ω : α) => ∑ i ∈ s, f i ω) μ

Alias of MeasureTheory.Integrable.finset_sum.

theorem EventuallyEq.finset_sum {α : Type u_1} {ι : Type u_2} {β : Type u_3} [AddCommGroup β] {l : Filter α} {s : Finset ι} {f g : ι → α → β} (hfg : ∀ i ∈ s, f i =ᶠ[l] g i) :

∑ i ∈ s, f i =ᶠ[l] ∑ i ∈ s, g i

Alias of Filter.EventuallyEq.finset_sum.

def Kernel.iter (S : Type u) [MeasurableSpace S] (κ : ProbabilityTheory.Kernel S S) :

ℕ → ProbabilityTheory.Kernel S S

Alias of ProbabilityTheory.MarkovChain.Kernel.iter.

Iterates of the transition kernel of a Markov chain.

Equations

Kernel.iter = ProbabilityTheory.MarkovChain.Kernel.iter
