\documentclass[10pt]{article}
\usepackage{amsfonts,amsmath}
\usepackage{fullpage}
\begin{document}
\input{preamble.tex}%note that this gives a compiler error unless
%the "binom" definition is modified
\lecture{22}{November 24, 2004}{Madhu Sudan}{Paul Valiant} \noindent
\\
\section{Trevisan's Extractor}
The following is Trevisan's construction of an extractor. Consider a code $C:\{0,1\}^n\rightarrow\{0,1\}^N$ that is a $(\frac{1}{2}-\delta,L)$ soft-decision list decoder for some list size $L=\mbox{poly}(\frac{n}{\delta})$. Note that we have seen explicit constructions for codes with $N=\frac{n}{\delta^8}$ and $L\leq N$.
Recall that a $(m,t,l,a)$-\emph{design} is a collection of sets $S_1,...,S_m$ with $S_i\subset [t], |S_i|=l$, and the condition that for every $i,j$, we have \[|S_i\cap S_j|\leq a.\]
We note that Trevisan shows these designs exist if \[t\geq \frac{l^2}{a}\exp\left(\frac{\log m}{a}\right).\]
For $l=\log_2 N$ we define an extractor $E:\{0,1\}^n\times\{0,1\}^t\rightarrow\{0,1\}^m$ as \[E(x,y)=C(x)_{y_1},C(x)_{y_2},...,C(x)_{y_m}\] where $y_i$ is the restriction of $y$ to the set $S_i$, which is a string in $\{0,1\}^l$ and can be interpreted to be an integer in $[N]$.
We provide a proof sketch for the following claim.
\begin{claim} $E$ is a $(k,\epsilon)$ extractor provided $\delta<\frac{\epsilon}{2m}$ and $k>m2^k+O(\log\frac{n}{\epsilon})$.
\end{claim}
\begin{proof} (Sketch) Assume for the sake of contradiction that $A$ \emph{$\epsilon$-distinguishes} a random $x\in X$ for some $X$ with $|X|=2^k$, namely \[\mathop{\mbox{Pr}}_{x\in X,y\in U_t} [A(E(x,y),y)=1]>\mathop{\mbox{Pr}}_{z\in U_m,y\in U_t}[A(z,y)=1]+\epsilon,\] where $U_k$ represents the uniform distribution on strings in $\{0,1\}^k$.
Using standard Markov arguments, one can show that there exists some set $X_{\frac{\epsilon}{2}}\subset X$ of size at least $\frac{\epsilon}{2}|X|=\frac{\epsilon}{2}2^k$ such that $A$ $\frac{\epsilon}{2}$-distinguishes \emph{every} $x\in X_{\frac{\epsilon}{2}}$.
We now show that if one can $\frac{\epsilon}{2}$-distinguish $m$ bits of $E(x,y)$ from the uniform distribution, then one can $\frac{\epsilon}{2m}$-distinguish a single bit from uniform. We use a technique called \emph{hybridization}.
Define distributions $D_0,...,D_m$ as follows: for fixed $x$ and random $y\in U_t$ consider the $m$ pseudo-random bits of $E(x,y)=(C(x)_{y_1},...,C(x)_{y_m})$. Further, let $b_1 ... b_m$ be $m$ truly random bits. Define the distribution $D_i$ by the distribution of \[(C(x)_{y_1},...,C(x)_{y_i-1},b_{i+1},...,b_m).\]
Let \[p_i\eqdef\mathop{\rm Pr}_{(z,y)\in D_i}[A(z,y)=1].\] Note that by definition \[p_m-p_0>\frac{\epsilon}{2}.\] Further, we may expand this into a telescoping sum as \[\frac{\epsilon}{2}\frac{\epsilon}{2m}.\]
Thus for each $x\in X_{\frac{\epsilon}{2}}$, \[\mathop{\rm E}_{y,b}[A(C(x)_{y_1},...,C(x)_{y_{i}},b_{i+1},...,b_{m})-A(C(x)_{y_1},...,C(x)_{y_{i-1}},b_{i},...,b_{m})]>\frac{\epsilon}{2m},\] from which we conclude that for any $x$ there exists a suffix $b_{i+1},...,b_{m}$ such that the above equation holds. Further, for some specific suffix $b_{i+1},...,b_{m}$ the set of strings $x$ for which the above equation holds $X_{\frac{\epsilon}{2},i,b_{i+1},...,b_m}$, is at least $2^{-m}\left|X_{\frac{\epsilon}{2}}\right|\geq \frac{\epsilon}{2} 2^{k-m}$.
Similarly, the above bound is an average over all $y$, but we are concerned only with the digits of $y$ that help determine the bit $C(x)_{y_i}$, namely those digits of $y$ indexed by elements of the set $S_i$. Thus, as above, we conclude that for any $x$ there exists an assignment to $y|_{[t]-S_i}$ such that we can distinguish $x$ as above. We would like to claim as above that a large set of values of $x$ may be distinguished by the same $y$. However, we have the problem that there are way too many possible values for $y$ for such a pidgeonhole arguement to work.
Recall that we have constructed a distinguisher $A'$ such that for every $x\in X_{\frac{\epsilon}{2},i,b_{i+1},...,b_m}$, $A'$ $\frac{\epsilon}{2m}$-distinguishes \[(C(x)_{y_1},...,C(x)_{y_{i-1}},b_i,y)\] from \[(C(x)_{y_1},...,C(x)_{y_{i}},y).\] Thus, instead of finding a set of $x$ that is distinguished by $y|_{[t]-S_i}$, we can find a set of $x$ that is distinguished by the same functions $(C(x)_{y_1},...,C(x)_{y_{i-1}})$, for any values of $C(x)$. Note that this is not as bad as it appears, since we may define each $C(x)_{y_k}$ by its values for each assignment to \[y|{S_i\cap S_k}.\] Since by definition \[|S_i\cap S_k|\leq a,\] we need only define $2^a$ values of $C(x)_{y_k}$. Thus $(C(x)_{y_1},...,C(x)_{y_{i-1}})$ is determined by $m2^a$ values.
Now we can apply the pidgeonhole principle to conclude that there is a distinguisher $A''$ and a set $X_{\frac{\epsilon}{2},i,b_{i+1},...,b_m,\{C(x)_{y_{*\delta$ where $C$ was defined to correct a $\frac{1}{2}-\delta$ fraction of errors, there can be at most $L$ codewords that lie within distance $\frac{1}{2}-\frac{\epsilon}{2m}$ of $d$, and we have the desired contradiction provided that \[|X_{\frac{\epsilon}{2},i,b_{i+1},...,b_m,\{C(x)_{y_{**m2^a+\log {\rm O}\left(\left(\frac{m}{\epsilon}\right)\right)\] then we have a $(k,\epsilon)$ extractor, as desired.
\end{proof}
We now compute some parameters of the above construction.
Suppose we wish to construct a $(k,\epsilon)$ extractor $\{0,1\}^n\times\{0,1\}^t\rightarrow\{0,1\}^m$ where $k=\sqrt{n}$, and $m=k^{1-\gamma}$ for some $\gamma$. From above, we can construct extractors for \[2^a\approx \frac{k}{m} =k^\gamma,\] implying $a\approx \gamma \log k$. Also, for our error-correcting code, $L\approx N=2^l$, so $l\approx \log N$. Recall that we could construct designs for \[t\geq \frac{l^2}{a}\exp\left(\frac{\log m}{a}\right).\] Thus, we have \begin{align*}
t&\approx \frac{l^2}{a}\exp\left(\frac{\log m}{a}\right)\\
&\approx\frac{\log^2 N}{\gamma \log k} \exp\left(\frac{\log k}{\gamma \log k}\right)\\
&\approx \frac{{\rm O}(\log^2 k)}{\gamma \log k} O(1)=O(\log k).\end{align*}
Thus we use only logarithmic additional randomness. We note that we have seen an existence argument for an extractor that uses \[t=\log_2 n+{\rm O}(1)\] additional randomness, while Trevisan's extractor is a construction requiring only \[t={\rm O}(\log n)\] additional randomness.
\section{The Ta-Shma, Zuckerman, Safra extractor}
Let $n$ be the length of the quasi-random source string, and let $q\approx \sqrt{n}$ be prime. We will use a two level construction for the extractor, with a small inner code $C_{\rm small}:\mathbb{F}_q\rightarrow\{0,1\}^l$ constructed so as to be $(\frac{1}{2}-\delta,L$ list decodable used to encode elements in the field $\mathbb{F}_q$.
The ``outer'' component of the construction is to consider the length-$n$ input string as specifying the coefficients of a bivariate polynomial $P$ over $\mathbb{F}_q$ of degree $(\sqrt{n},\sqrt{n})$. For now, it does not matter whether the input string is only $0-1$. The extractor uses the additional randomness to generate an ordered pair $(a,b)\in [\sqrt{n}]\times [\sqrt{n}]$ and an index $j\in [l]$. The output of the extractor is \[E(P,(a,b),j)=C_{\rm small}(P(a,b))_j,...,C_{\rm small}(P(a+m-1,b))_j,\] where $m$ is the length of the desired output.
We present a heuristic sketch of some of the properties of this extractor. This is covered in more detail in the next lecture.
As for the Trevisan extractor, we suppose for the sake of contradiction that there exists a function $A$ that $\epsilon$-distinguishes the output of this extractor for $P$ drawn randomly from some $X$ of size at least $2^k$. As above, by Markov arguments we conclude that $A$ is an $\frac{\epsilon}{2}$-distinguisher for each $x\in X_\frac{\epsilon}{2}$ for some $X_\frac{\epsilon}{2}$ of size at least $\frac{\epsilon}{2}2^k$.
Using hybridization arguments similar to those used in the case for the Trevisan extractor, we conclude that there exists a predictor $B$ for $P(a,b)$ given $(P(a-i,b),...,P(a-1,b),a,b)$ where $i*