Documentation

LeanPool.ZhangYeungInequality.Delta

The Zhang-Yeung delta quantity #

The Zhang-Yeung delta

Δ(Z, U | X, Y) := I(Z; U) - I(Z; U | X) - I(Z; U | Y)

is the central quantity of the Zhang-Yeung conditional information inequality [@zhangyeung1998]. The main theorem of the paper states an upper bound on this quantity; that bound is a non-Shannon information inequality whose proof rests on the copy lemma and lives in a later milestone. This module only captures the equational content: the definition, the symmetries induced by commutativity of mutual information, the expansion into raw entropy terms, and the linear-arithmetic interrelation of the three shapes in which the paper presents the main inequality (equations 21, 22, and 23).

Main definitions #

Main statements #

Implementation notes #

The four codomains S₁ S₂ S₃ S₄ of the random variables live under finite-alphabet specializations [Finite Sᵢ] + [MeasurableSingletonClass Sᵢ]. This matches PFR's downstream style more closely: the public theorem statements expose only finiteness, while proofs recover concrete enumerations through typeclass search when PFR's entropy lemmas need them. The variable blocks are staged: the definition and the purely algebraic lemmas only need [MeasurableSpace Sᵢ]; downstream lemmas live inside two nested sections, an outer one adding the fixture on the measured codomains S₁, S₂ for the symmetry and bounding lemmas, and a nested inner one extending it to the conditioning codomains S₃, S₄ for the entropy-expansion lemma.

No notation Δ[Z : U | X, Y; μ] is introduced; plain function application delta Z U X Y μ suffices for the uses anticipated in the current milestone. The decision to introduce notation is deferred until a later milestone whose proofs exercise delta heavily enough to warrant it.

The delta_self lemma handles only the literal repeated-conditioner case X = Y. Bridging Δ(Z, U | X, X₁) where X₁ is merely a copy of X requires a separate transport lemma for condMutualInfo (under the copy construction's IdentDistrib hypotheses), which is out of scope for this module.

References #

Tags #

Shannon entropy, mutual information, non-Shannon information inequality, Zhang-Yeung

noncomputable def ZhangYeung.delta {Ω : Type u_1} [MeasurableSpace Ω] {S₁ : Type u_2} {S₂ : Type u_3} {S₃ : Type u_4} {S₄ : Type u_5} [MeasurableSpace S₁] [MeasurableSpace S₂] [MeasurableSpace S₃] [MeasurableSpace S₄] (Z : ΩS₁) (U : ΩS₂) (X : ΩS₃) (Y : ΩS₄) (μ : MeasureTheory.Measure Ω := by volume_tac) :

The Zhang-Yeung delta Δ(Z, U | X, Y) := I(Z; U) - I(Z; U | X) - I(Z; U | Y). This is the central quantity of [@zhangyeung1998, §III, eqs. 20-23]; the main inequality of the paper bounds it from above by a Shannon-type expression in the four random variables, but that bound is a non-Shannon information inequality proved via the copy lemma and is not part of this definition.

Equations
Instances For
    theorem ZhangYeung.delta_def {Ω : Type u_1} [MeasurableSpace Ω] {S₁ : Type u_2} {S₂ : Type u_3} {S₃ : Type u_4} {S₄ : Type u_5} [MeasurableSpace S₁] [MeasurableSpace S₂] [MeasurableSpace S₃] [MeasurableSpace S₄] (Z : ΩS₁) (U : ΩS₂) (X : ΩS₃) (Y : ΩS₄) (μ : MeasureTheory.Measure Ω) :
    delta Z U X Y μ = I[Z : U; μ] - I[Z : U|X;μ] - I[Z : U|Y;μ]

    Definitional unfolding of delta.

    theorem ZhangYeung.delta_comm_cond {Ω : Type u_1} [MeasurableSpace Ω] {S₁ : Type u_2} {S₂ : Type u_3} {S₃ : Type u_4} {S₄ : Type u_5} [MeasurableSpace S₁] [MeasurableSpace S₂] [MeasurableSpace S₃] [MeasurableSpace S₄] (Z : ΩS₁) (U : ΩS₂) (X : ΩS₃) (Y : ΩS₄) (μ : MeasureTheory.Measure Ω) :
    delta Z U X Y μ = delta Z U Y X μ

    Swapping the two conditioning arguments leaves delta unchanged.

    theorem ZhangYeung.delta_self {Ω : Type u_1} [MeasurableSpace Ω] {S₁ : Type u_2} {S₂ : Type u_3} {S₃ : Type u_4} [MeasurableSpace S₁] [MeasurableSpace S₂] [MeasurableSpace S₃] (Z : ΩS₁) (U : ΩS₂) (X : ΩS₃) (μ : MeasureTheory.Measure Ω) :
    delta Z U X X μ = I[Z : U; μ] - 2 * I[Z : U|X;μ]

    The case X = Y: Δ(Z, U | X, X) = I(Z; U) - 2·I(Z; U | X).

    theorem ZhangYeung.delta_form21_iff {Ω : Type u_1} [MeasurableSpace Ω] {S₁ : Type u_2} {S₂ : Type u_3} {S₃ : Type u_4} {S₄ : Type u_5} [MeasurableSpace S₁] [MeasurableSpace S₂] [MeasurableSpace S₃] [MeasurableSpace S₄] (Z : ΩS₁) (U : ΩS₂) (X : ΩS₃) (Y : ΩS₄) (μ : MeasureTheory.Measure Ω) :
    2 * delta Z U X Y μ I[X : Y; μ] + I[X : prod Z U; μ] + I[Z : U|X;μ] - I[Z : U|Y;μ] 2 * I[Z : U; μ] - 3 * I[Z : U|X;μ] - I[Z : U|Y;μ] I[X : Y; μ] + I[X : prod Z U; μ]

    Paper eq. (21) of [@zhangyeung1998, §III]: the inequality 2·Δ(Z, U | X, Y) ≤ I(X;Y) + I(X;ZU) + I(Z;U|X) - I(Z;U|Y) is equivalent to the compact form 2·I(Z;U) - 3·I(Z;U|X) - I(Z;U|Y) ≤ I(X;Y) + I(X;ZU), which is the shape a copy-lemma proof naturally produces.

    theorem ZhangYeung.delta_form22_iff {Ω : Type u_1} [MeasurableSpace Ω] {S₁ : Type u_2} {S₂ : Type u_3} {S₃ : Type u_4} {S₄ : Type u_5} [MeasurableSpace S₁] [MeasurableSpace S₂] [MeasurableSpace S₃] [MeasurableSpace S₄] (Z : ΩS₁) (U : ΩS₂) (X : ΩS₃) (Y : ΩS₄) (μ : MeasureTheory.Measure Ω) :
    2 * delta Z U X Y μ I[X : Y; μ] + I[Y : prod Z U; μ] - I[Z : U|X;μ] + I[Z : U|Y;μ] 2 * I[Z : U; μ] - I[Z : U|X;μ] - 3 * I[Z : U|Y;μ] I[X : Y; μ] + I[Y : prod Z U; μ]

    Paper eq. (22) of [@zhangyeung1998, §III]: the X ↔ Y swap of delta_form21_iff.

    theorem ZhangYeung.delta_form23_of_form21_form22 {Ω : Type u_1} [MeasurableSpace Ω] {S₁ : Type u_2} {S₂ : Type u_3} {S₃ : Type u_4} {S₄ : Type u_5} [MeasurableSpace S₁] [MeasurableSpace S₂] [MeasurableSpace S₃] [MeasurableSpace S₄] {Z : ΩS₁} {U : ΩS₂} {X : ΩS₃} {Y : ΩS₄} {μ : MeasureTheory.Measure Ω} (h21 : 2 * delta Z U X Y μ I[X : Y; μ] + I[X : fun (ω : Ω) => (Z ω, U ω); μ] + I[Z : U|X;μ] - I[Z : U|Y;μ]) (h22 : 2 * delta Z U X Y μ I[X : Y; μ] + I[Y : fun (ω : Ω) => (Z ω, U ω); μ] - I[Z : U|X;μ] + I[Z : U|Y;μ]) :
    4 * delta Z U X Y μ 2 * I[X : Y; μ] + I[X : fun (ω : Ω) => (Z ω, U ω); μ] + I[Y : fun (ω : Ω) => (Z ω, U ω); μ]

    Paper eq. (23) of [@zhangyeung1998, §III], the symmetric form of Theorem 3, follows from eqs. (21) and (22) by averaging. This lemma contains no measure-theoretic content; the inequalities (21) and (22) are the nontrivial inputs and are proved in a later milestone via the copy lemma.

    theorem ZhangYeung.delta_form23_iff {Ω : Type u_1} [MeasurableSpace Ω] {S₁ : Type u_2} {S₂ : Type u_3} {S₃ : Type u_4} {S₄ : Type u_5} [MeasurableSpace S₁] [MeasurableSpace S₂] [MeasurableSpace S₃] [MeasurableSpace S₄] (Z : ΩS₁) (U : ΩS₂) (X : ΩS₃) (Y : ΩS₄) (μ : MeasureTheory.Measure Ω) :
    4 * delta Z U X Y μ 2 * I[X : Y; μ] + I[X : prod Z U; μ] + I[Y : prod Z U; μ] delta Z U X Y μ 1 / 2 * I[X : Y; μ] + 1 / 4 * (I[X : prod Z U; μ] + I[Y : prod Z U; μ])

    The integer-scaled conclusion of delta_form23_of_form21_form22 is equivalent to the paper's 1/2 and 1/4 statement [@zhangyeung1998, §III, eq. 23].

    Lemmas requiring finite-alphabet structure #

    The remaining lemmas rely on PFR's commutativity and entropy-expansion results, which are stated under discrete/countable hypotheses on the codomains of the measured random variables. An outer section fixes [Finite Sᵢ] and [MeasurableSingletonClass Sᵢ] on the measured pair S₁, S₂ for the symmetry and bounding lemmas; a nested inner section extends the same fixtures to the conditioning codomains S₃, S₄ for the entropy-expansion lemma. Each fixture supplies the relevant FiniteRange obligations via the instance {Ω G : Type*} (X : Ω → G) [Finite G] : FiniteRange X.

    theorem ZhangYeung.delta_comm_main {Ω : Type u_1} [MeasurableSpace Ω] {S₁ : Type u_2} {S₂ : Type u_3} {S₃ : Type u_4} {S₄ : Type u_5} [MeasurableSpace S₁] [MeasurableSpace S₂] [MeasurableSpace S₃] [MeasurableSpace S₄] [Finite S₁] [Finite S₂] [MeasurableSingletonClass S₁] [MeasurableSingletonClass S₂] {Z : ΩS₁} {U : ΩS₂} (hZ : Measurable Z) (hU : Measurable U) {X : ΩS₃} {Y : ΩS₄} (μ : MeasureTheory.Measure Ω) :
    delta Z U X Y μ = delta U Z X Y μ

    Swapping the two measured arguments leaves delta unchanged, via mutualInfo_comm and condMutualInfo_comm.

    theorem ZhangYeung.delta_le_mutualInfo {Ω : Type u_1} [MeasurableSpace Ω] {S₁ : Type u_2} {S₂ : Type u_3} {S₃ : Type u_4} {S₄ : Type u_5} [MeasurableSpace S₁] [MeasurableSpace S₂] [MeasurableSpace S₃] [MeasurableSpace S₄] [Finite S₁] [Finite S₂] [MeasurableSingletonClass S₁] [MeasurableSingletonClass S₂] {Z : ΩS₁} {U : ΩS₂} (hZ : Measurable Z) (hU : Measurable U) {X : ΩS₃} {Y : ΩS₄} (μ : MeasureTheory.Measure Ω) :
    delta Z U X Y μ I[Z : U; μ]

    Δ(Z, U | X, Y) ≤ I(Z; U): the delta is bounded above by the unconditional mutual information, since conditional mutual information is non-negative.

    theorem ZhangYeung.delta_eq_entropy {Ω : Type u_1} [MeasurableSpace Ω] {S₁ : Type u_2} {S₂ : Type u_3} {S₃ : Type u_4} {S₄ : Type u_5} [MeasurableSpace S₁] [MeasurableSpace S₂] [MeasurableSpace S₃] [MeasurableSpace S₄] [Finite S₁] [Finite S₂] [MeasurableSingletonClass S₁] [MeasurableSingletonClass S₂] [Finite S₃] [Finite S₄] [MeasurableSingletonClass S₃] [MeasurableSingletonClass S₄] {Z : ΩS₁} {U : ΩS₂} {X : ΩS₃} {Y : ΩS₄} (hZ : Measurable Z) (hU : Measurable U) (hX : Measurable X) (hY : Measurable Y) (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] :
    delta Z U X Y μ = H[Z; μ] + H[U; μ] - H[prod Z U; μ] - (H[Z | X; μ] + H[U | X; μ] - H[prod Z U | X; μ]) - (H[Z | Y; μ] + H[U | Y; μ] - H[prod Z U | Y; μ])

    Expand delta all the way down to raw entropy terms, using mutualInfo_def and condMutualInfo_eq. This is the bridge to any reasoning at the entropy layer directly (for example, evaluating delta on a concrete four-variable distribution when checking bounds or building counterexamples).