LeanPool.FormalLearningTheory.Complexity.Generalization.Tail #
Extract block j from a flat array of k*m elements, using finProdFinEquiv.
Equations
- blockExtract k m S j i = S (finProdFinEquiv (j, i))
Instances For
Block index sets are disjoint for distinct blocks.
Block extraction is measurable: extracting block j from a pi-type is measurable.
Block extractions are independent under the product measure. Key infrastructure for boosting (D4) and probability amplification.
Shattering lifting: if T is shattered by C and f : ↥T → Bool, then there exists c ∈ C that agrees with f on all of T.
Key counting lemma: for any h : ↥T → Bool on a shattered set T with |T| ≥ 2, there exists c ∈ C with #{x ∈ T | c x ≠ h x} > |T|/4. Lifts exists_many_disagreements through shattering.
NFL per-sample lemma for shattered sets: for ANY fixed sample xs and ANY hypothesis h, there exists c ∈ C agreeing with h on sample points but with high error (> 1 / 4) on the shattered set T. Uses the counting argument on unseen points via disagreement_sum_eq.
If VCDim = ⊤, then C is not PAC learnable. Proof: for any learner L with sample function mf, pick ε = 1 / 4, δ = 1 / 4. Let m = mf(1 / 4, 1 / 4). Since VCDim = ⊤, ∃ shattered set S with |S| ≥ 2m. Put D = uniform on S. For random labeling, any m-sample learner has expected error ≥ 1 / 4 on unseen points. This is the core of pac_imp_vcdim_finite (contrapositive direction).