Regularity, Boosting, and
Efficiently Simulating Every High-Entropy Distribution

Luca Trevisan, Madhur Tulsiani, and Salil Vadhan


We show that every bounded function g: {0,1}^n -> [0,1] admits an efficiently computable "simulator" function h: {0,1}^n -> [0,1] such that every circuit of fixed polynomial size has approximately the same correlation with g as with h. If g describes (up to scaling) a high min-entropy distribution D, then h can be used to efficiently sample a distribution D' of the same min-entropy that is indistinguishable from D by circuits of fixed polynomial size. We state and prove our result in a more abstract setting, in which we allow arbitrary finite domains instead of {0,1}^n, and arbitrary families of distinguishers instead of circuits of fixed polynomial size. Our result implies (a) the Weak Szemerédi Regularity Lemma of Frieze and Kannan, (b) a constructive version of the Dense Model Theorem of Green, Tao and Ziegler with better quantitative parameters (polynomial rather than exponential in the distinguishing probability), and (c) the Impagliazzo Hardcore Set Lemma. It appears to be the general result underlying the known connections between "regularity" results in graph theory, "decomposition" results in additive combinatorics, and the Hardcore Lemma in complexity theory. We present two proofs of our result, one in the spirit of Nisan's proof of the Hardcore Lemma via duality of linear programming, and one similar to Impagliazzo's "boosting" proof. A third proof, by iterative partitioning, yields a sampler whose complexity is exponential in the distinguishing probability; it is also implicit in the Green-Tao-Ziegler proofs of the Dense Model Theorem.
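
To make the notion of a "simulator" concrete, here is a minimal Python sketch of a boosting-style construction on a small finite domain. It is an illustration under stated assumptions, not the paper's exact algorithm or analysis: the names `correlation` and `simulate`, the additive update with step size eps, the starting point h = 1/2, and the toy family of distinguishers are all choices made for this example. The loop repeatedly finds a distinguisher f whose average of f(x)(g(x) - h(x)) exceeds eps in absolute value, nudges h in the direction of f, and clips h back into [0,1].

```python
from itertools import product

def correlation(f, g, h):
    """Average of f(x) * (g(x) - h(x)) over the uniform distribution on the domain."""
    return sum(fx * (gx - hx) for fx, gx, hx in zip(f, g, h)) / len(g)

def simulate(g, family, eps):
    """Build h with values in [0,1] that no f in `family` tells apart from g by more than eps.

    Functions are lists of values indexed by domain point; each f is assumed
    to take values in [0,1], and `family` is assumed to be non-empty.
    """
    h = [0.5] * len(g)  # assumed starting point; any function into [0,1] works
    steps = 0
    while True:
        # Find the distinguisher with the largest advantage against the current h.
        best = max(family, key=lambda f: abs(correlation(f, g, h)))
        adv = correlation(best, g, h)
        if abs(adv) <= eps:
            return h, steps
        sign = 1.0 if adv > 0 else -1.0
        # Additive update toward g along `best`, then clip back into [0,1].
        h = [min(1.0, max(0.0, hx + eps * sign * fx)) for hx, fx in zip(h, best)]
        steps += 1

# Toy instance: domain {0,1}^3, distinguishers = the constant 1 and the three coordinates.
points = list(product([0, 1], repeat=3))
g = [(x[0] + x[1] * x[2]) / 2 for x in points]
family = [[1.0] * len(points)] + [[float(x[i]) for x in points] for i in range(3)]
eps = 0.05
h, steps = simulate(g, family, eps)
```

In this sketch each update lowers the average of (g(x) - h(x))^2 by at least eps^2, since clipping into [0,1] never moves h farther from g, so the loop stops after at most 1/eps^2 rounds. The resulting h is then a clipped sum of at most that many distinguishers, which is the sense in which its complexity is polynomial in 1/eps relative to the family.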

