Documentation

LeanPool.RlTheoryInLean

RL Theory in Lean #

Source: arxiv:2511.03618 Authors: Shangtong Zhang Status: verified Main declarations: StochasticMatrix.stationary_distribution_exists Tags: probability, reinforcement-learning, stochastic-matrices MSC: 62L20, 60J10

Provenance #

Imported from https://github.com/ShangtongZhang/rl-theory-in-lean (MIT-licensed upstream; relicensed into Lean Pool under Apache 2.0 with the upstream author's copyright preserved). Accompanies the paper Towards Formalizing Reinforcement Learning Theory (arXiv:2511.03618). Ported from Lean v4.28.0-rc1 to Lean Pool's v4.30.0-rc2. This Lean Pool import keeps the warning-clean core infrastructure.

Mathematical overview #

The project mirrors Mathlib's directory layout and develops stochastic row-stochastic matrices, Doeblin minorization and geometric mixing for finite chains, finite Markov-chain kernels and trajectory measures, measure/kernel helper lemmas, and discrete Gronwall inequalities used in stochastic approximation.

Alias of StochasticMatrix.Aperiodic.


A stochastic matrix is aperiodic if for each state, the GCD of return times is 1

Equations
Instances For

    Alias of StochasticMatrix.DoeblinMinorization.


    A Doeblin minorization for a finite stochastic matrix.

    Equations
    Instances For

      Alias of StochasticMatrix.GeometricMixing.


      Geometric convergence to stationarity in total variation/L¹ distance.

      Equations
      Instances For
        def RowStochastic {S : Type u} [Fintype S] (P : Matrix S S ) :

        Alias of StochasticMatrix.RowStochastic.


        A matrix whose rows are stochastic vectors.

        Equations
        Instances For
          def Stationary {S : Type u} [Fintype S] (μ : S) (P : Matrix S S ) :

          Alias of StochasticMatrix.Stationary.


          A stationary distribution for a matrix.

          Equations
          Instances For
            def cesaroAverage {S : Type u} [Fintype S] [DecidableEq S] (x₀ : S) (P : Matrix S S ) (n : ) :
            S

            Alias of StochasticMatrix.cesaroAverage.


            The Cesaro average of the first n + 1 iterates of a stochastic vector.

            Equations
            Instances For
              theorem chapman_kolmogorov_eq_ge {S : Type u} [Fintype S] [DecidableEq S] (P : Matrix S S ) [StochasticMatrix.RowStochastic P] (m n : ) (i j k : S) :
              (P ^ (m + n)) i j (P ^ m) i k * (P ^ n) k j

              Alias of StochasticMatrix.chapman_kolmogorov_eq_ge.

              def returnTimes {S : Type u} [Fintype S] [DecidableEq S] (P : Matrix S S ) (i : S) :

              Alias of StochasticMatrix.returnTimes.


              The set of positive return times for state i

              Equations
              Instances For

                Alias of StochasticMatrix.return_times_add_mem.


                Return times are closed under addition (used via AddSubmonoid.closure)

                Alias of StochasticMatrix.smatAsOperator.


                The affine action of a stochastic matrix on the probability simplex.

                Equations
                Instances For
                  theorem sum_svec_mul_smat_eq_one {S : Type u} [Fintype S] (μ : S) [StochasticMatrix.StochasticVec μ] (P : Matrix S S ) [StochasticMatrix.RowStochastic P] :
                  i : S, j : S, μ i * P i j = 1

                  Alias of StochasticMatrix.sum_svec_mul_smat_eq_one.

                  def uniformDistribution {S : Type u} [Fintype S] :
                  S

                  Alias of StochasticMatrix.uniformDistribution.


                  The uniform probability distribution on a nonempty finite type.

                  Equations
                  Instances For
                    theorem Integrable.finset_sum {α : Type u_2} {ι : Type u_3} {m : MeasurableSpace α} {μ : MeasureTheory.Measure α} [MeasureTheory.IsFiniteMeasure μ] {s : Finset ι} {f : ια} (hf : is, MeasureTheory.Integrable (f i) μ) :
                    MeasureTheory.Integrable (fun (ω : α) => is, f i ω) μ

                    Alias of MeasureTheory.Integrable.finset_sum.

                    theorem EventuallyEq.finset_sum {α : Type u_1} {ι : Type u_2} {β : Type u_3} [AddCommGroup β] {l : Filter α} {s : Finset ι} {f g : ιαβ} (hfg : is, f i =ᶠ[l] g i) :
                    is, f i =ᶠ[l] is, g i

                    Alias of Filter.EventuallyEq.finset_sum.

                    Alias of ProbabilityTheory.MarkovChain.Kernel.iter.


                    Iterates of the transition kernel of a Markov chain.

                    Equations
                    Instances For