The Zhang-Yeung inequality (Theorem 3) #
Theorem 3 of [@zhangyeung1998, §III, eqs. 21-23] is the headline non-Shannon-type
information
inequality of the paper: the first known instance of a linear inequality on the
entropies of four
discrete random variables that does not follow from Shannon's basic inequalities. This
module
closes the Shannon chase that takes the two copy-lemma inputs of M2
(copyLemma_delta_le_mutualInfo_Y₁ and copyLemma_delta_le_mutualInfo_X_X₁) to the
three
equivalent forms of the paper's conclusion (eqs. 21, 22, 23).
Statement of eq. (21):
Δ(Z, U | X, Y) ≤ (1/2)·(I(X; Y) + I(X; ⟨Z, U⟩) + I(Z; U | X) - I(Z; U | Y)).
Its X ↔ Y symmetric partner (eq. 22) and their average (eq. 23) follow by mechanical
algebra on
delta_form22_iff, delta_form23_iff, and delta_form23_of_form21_form22 from
ZhangYeung/Delta.lean.
Main statements #
ZhangYeung.zhangYeung: paper eq. (21), the asymmetric form the copy-lemma chase naturally produces.ZhangYeung.zhangYeung_dual: paper eq. (22), theX ↔ Yswap of eq. (21).ZhangYeung.zhangYeung_averaged: paper eq. (23), the symmetric headline form obtained by averaging (21) and (22).
Implementation notes #
The cleanest proof produces the integer-scaled form first (because the chase naturally
closes at
2·I[Z:U] - 3·I[Z:U|X] - I[Z:U|Y] ≤ I[X:Y] + I[X:⟨Z, U⟩]) and converts to the paper's
(1/2)-scaled form at the end via the M1 delta_form21_iff lemma. A private theorem
zhangYeung_integer captures this intermediate shape, and the three public theorems are
thin
wrappers that route through the form-conversion lemmas of ZhangYeung/Delta.lean.
The two generic Shannon helpers the chase needs now live in ZhangYeung/Prelude.lean:
mutualInfo_add_three_way_identity (the "peeling chain rule twice" identity I[X:Y] + I[X:Z] = I[X:⟨Y,Z⟩] + I[Y:Z] - I[Y:Z|X]) and mutualInfo_le_of_condIndepFun (the
data-processing
inequality CondIndepFun X Y Z μ → I[X:Y] ≤ I[X:Z]). M3 remains their first consumer;
M5 becomes
the second.
Two measurable projection helpers (measurable_pairXZU, measurable_pairXY) package
the specific
4-tuple projections the main chase invokes through IdentDistrib.comp to extract the
pair-level
IdentDistribs consumed by IdentDistrib.mutualInfo_eq. They are private and local
to this
file; their specific shapes are tied to the chase.
The four codomains S₁, S₂, S₃, S₄ of the measured random variables are bound at a
common universe
u, inherited from the copyLemma existential (ZhangYeung/CopyLemma.lean).
References #
- [@zhangyeung1998, §III, Theorem 3, eqs. 21-23] -- see
references/transcriptions/zhangyeung1998.mdfor a verbatim transcription of the theorem statement (line 290) and the proof (lines 680-709), verified 2026-04-16.
Tags #
Shannon entropy, mutual information, non-Shannon information inequality, Zhang-Yeung, data processing
Helper projections #
Two projection-measurability facts local to the Theorem 3 chase. The pure-Shannon
helpers consumed
below now live in ZhangYeung/Prelude.lean.
Main theorems #
Module-scope fixtures for the three public theorems. The four codomains are bound at a
single
universe u because the copyLemma existential consumed by zhangYeung_integer is
universe-bound.
Theorem 3 of [@zhangyeung1998, §III, eq. 21] -- the Zhang-Yeung inequality in the asymmetric form the copy-lemma proof naturally produces:
Δ(Z, U | X, Y) ≤ (1/2)·(I(X; Y) + I(X; ⟨Z, U⟩) + I(Z; U | X) - I(Z; U | Y)).
This is the first known non-Shannon-type information inequality, in the form paper eq. (21) states.
The X ↔ Y dual of Theorem 3 [@zhangyeung1998, §III, eq. 22]:
Δ(Z, U | X, Y) ≤ (1/2)·(I(X; Y) + I(Y; ⟨Z, U⟩) - I(Z; U | X) + I(Z; U | Y)).
Obtained from zhangYeung by swapping X and Y in the hypotheses, using
delta_comm_cond to
rewrite the left-hand side back to the original delta, and mutualInfo_comm to rewrite
I[Y : X]
as I[X : Y].
The averaged symmetric form of Theorem 3 [@zhangyeung1998, §III, eq. 23]:
Δ(Z, U | X, Y) ≤ (1/2)·I(X; Y) + (1/4)·(I(X; ⟨Z, U⟩) + I(Y; ⟨Z, U⟩)).
The paper highlights this as Theorem 3's headline statement; it follows by averaging
eqs. (21) and
(22) via the M1 delta_form23_of_form21_form22 averaging lemma.