Documentation

LeanPool.ZhangYeungInequality.Theorem3

The Zhang-Yeung inequality (Theorem 3) #

Theorem 3 of [@zhangyeung1998, §III, eqs. 21-23] is the headline non-Shannon-type information inequality of the paper: the first known instance of a linear inequality on the entropies of four discrete random variables that does not follow from Shannon's basic inequalities. This module closes the Shannon chase that takes the two copy-lemma inputs of M2 (copyLemma_delta_le_mutualInfo_Y₁ and copyLemma_delta_le_mutualInfo_X_X₁) to the three equivalent forms of the paper's conclusion (eqs. 21, 22, 23).

Statement of eq. (21):

Δ(Z, U | X, Y) ≤ (1/2)·(I(X; Y) + I(X; ⟨Z, U⟩) + I(Z; U | X) - I(Z; U | Y)).

Its X ↔ Y symmetric partner (eq. 22) and their average (eq. 23) follow by mechanical algebra on delta_form22_iff, delta_form23_iff, and delta_form23_of_form21_form22 from ZhangYeung/Delta.lean.

Main statements #

Implementation notes #

The cleanest proof produces the integer-scaled form first (because the chase naturally closes at 2·I[Z:U] - 3·I[Z:U|X] - I[Z:U|Y] ≤ I[X:Y] + I[X:⟨Z, U⟩]) and converts to the paper's (1/2)-scaled form at the end via the M1 delta_form21_iff lemma. A private theorem zhangYeung_integer captures this intermediate shape, and the three public theorems are thin wrappers that route through the form-conversion lemmas of ZhangYeung/Delta.lean.

The two generic Shannon helpers the chase needs now live in ZhangYeung/Prelude.lean: mutualInfo_add_three_way_identity (the "peeling chain rule twice" identity I[X:Y] + I[X:Z] = I[X:⟨Y,Z⟩] + I[Y:Z] - I[Y:Z|X]) and mutualInfo_le_of_condIndepFun (the data-processing inequality CondIndepFun X Y Z μ → I[X:Y] ≤ I[X:Z]). M3 remains their first consumer; M5 becomes the second.

Two measurable projection helpers (measurable_pairXZU, measurable_pairXY) package the specific 4-tuple projections the main chase invokes through IdentDistrib.comp to extract the pair-level IdentDistribs consumed by IdentDistrib.mutualInfo_eq. They are private and local to this file; their specific shapes are tied to the chase.

The four codomains S₁, S₂, S₃, S₄ of the measured random variables are bound at a common universe u, inherited from the copyLemma existential (ZhangYeung/CopyLemma.lean).

References #

Tags #

Shannon entropy, mutual information, non-Shannon information inequality, Zhang-Yeung, data processing

Helper projections #

Two projection-measurability facts local to the Theorem 3 chase. The pure-Shannon helpers consumed below now live in ZhangYeung/Prelude.lean.

Main theorems #

Module-scope fixtures for the three public theorems. The four codomains are bound at a single universe u because the copyLemma existential consumed by zhangYeung_integer is universe-bound.

theorem ZhangYeung.zhangYeung {Ω : Type u_1} [MeasurableSpace Ω] {S₁ S₂ S₃ S₄ : Type u} [MeasurableSpace S₁] [MeasurableSpace S₂] [MeasurableSpace S₃] [MeasurableSpace S₄] [Finite S₁] [Finite S₂] [Finite S₃] [Finite S₄] [MeasurableSingletonClass S₁] [MeasurableSingletonClass S₂] [MeasurableSingletonClass S₃] [MeasurableSingletonClass S₄] {X : ΩS₁} {Y : ΩS₂} {Z : ΩS₃} {U : ΩS₄} (hX : Measurable X) (hY : Measurable Y) (hZ : Measurable Z) (hU : Measurable U) (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] :
delta Z U X Y μ 1 / 2 * (I[X : Y; μ] + I[X : prod Z U; μ] + I[Z : U|X;μ] - I[Z : U|Y;μ])

Theorem 3 of [@zhangyeung1998, §III, eq. 21] -- the Zhang-Yeung inequality in the asymmetric form the copy-lemma proof naturally produces:

Δ(Z, U | X, Y) ≤ (1/2)·(I(X; Y) + I(X; ⟨Z, U⟩) + I(Z; U | X) - I(Z; U | Y)).

This is the first known non-Shannon-type information inequality, in the form paper eq. (21) states.

theorem ZhangYeung.zhangYeung_dual {Ω : Type u_1} [MeasurableSpace Ω] {S₁ S₂ S₃ S₄ : Type u} [MeasurableSpace S₁] [MeasurableSpace S₂] [MeasurableSpace S₃] [MeasurableSpace S₄] [Finite S₁] [Finite S₂] [Finite S₃] [Finite S₄] [MeasurableSingletonClass S₁] [MeasurableSingletonClass S₂] [MeasurableSingletonClass S₃] [MeasurableSingletonClass S₄] {X : ΩS₁} {Y : ΩS₂} {Z : ΩS₃} {U : ΩS₄} (hX : Measurable X) (hY : Measurable Y) (hZ : Measurable Z) (hU : Measurable U) (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] :
delta Z U X Y μ 1 / 2 * (I[X : Y; μ] + I[Y : prod Z U; μ] - I[Z : U|X;μ] + I[Z : U|Y;μ])

The X ↔ Y dual of Theorem 3 [@zhangyeung1998, §III, eq. 22]:

Δ(Z, U | X, Y) ≤ (1/2)·(I(X; Y) + I(Y; ⟨Z, U⟩) - I(Z; U | X) + I(Z; U | Y)).

Obtained from zhangYeung by swapping X and Y in the hypotheses, using delta_comm_cond to rewrite the left-hand side back to the original delta, and mutualInfo_comm to rewrite I[Y : X] as I[X : Y].

theorem ZhangYeung.zhangYeung_averaged {Ω : Type u_1} [MeasurableSpace Ω] {S₁ S₂ S₃ S₄ : Type u} [MeasurableSpace S₁] [MeasurableSpace S₂] [MeasurableSpace S₃] [MeasurableSpace S₄] [Finite S₁] [Finite S₂] [Finite S₃] [Finite S₄] [MeasurableSingletonClass S₁] [MeasurableSingletonClass S₂] [MeasurableSingletonClass S₃] [MeasurableSingletonClass S₄] {X : ΩS₁} {Y : ΩS₂} {Z : ΩS₃} {U : ΩS₄} (hX : Measurable X) (hY : Measurable Y) (hZ : Measurable Z) (hU : Measurable U) (μ : MeasureTheory.Measure Ω) [MeasureTheory.IsProbabilityMeasure μ] :
delta Z U X Y μ 1 / 2 * I[X : Y; μ] + 1 / 4 * (I[X : prod Z U; μ] + I[Y : prod Z U; μ])

The averaged symmetric form of Theorem 3 [@zhangyeung1998, §III, eq. 23]:

Δ(Z, U | X, Y) ≤ (1/2)·I(X; Y) + (1/4)·(I(X; ⟨Z, U⟩) + I(Y; ⟨Z, U⟩)).

The paper highlights this as Theorem 3's headline statement; it follows by averaging eqs. (21) and (22) via the M1 delta_form23_of_form21_form22 averaging lemma.