\documentclass[10pt]{article}
\usepackage{amsfonts,amsthm,amsmath,amssymb}
\usepackage{array}
\usepackage{epsfig}
\usepackage{fullpage}
\def\on{\operatorname}
\def\FF{\mathbb F}
\def\eps{\varepsilon}
\begin{document}
\input{preamble.tex}
\renewcommand{\binset}{\bbF_2}
\handout{CS 229r Essential Coding Theory, Lecture 13}{Mar 7, 2017}{Instructor: Madhu Sudan}{Scribe: Alexander Wei}{Lecture 13}
\section{Overview}
Lecture today will cover graph-theoretic codes, as well as linear time encoding and decoding. So far, we have seen codes with relatively efficient runtimes: For example, the best known algorithms for the Reed-Solomon code can encode and decode erasures in time $O(n\log n)$ and list-decode in time $O(n\on{polylog}(n))$. Thus, a natural question to ask is whether we can improve on this bound and obtain codes with linear time encoding and decoding. It turns out that we can, but at a price. We can correct an $\Omega(1)$-fraction of errors with rate $R > 0$ using the \textbf{low-density parity-check} (LDPC) codes constructed by Sipser-Spielman and Spielman. The state of the art, due to a result of Guruswami and Indyk, is codes over large alphabets that uniquely correct a $\frac\delta2$-fraction of errors with rate $R = 1-\delta$.
There are two classes of graph-theoretic codes---based on whether the graph represents the parity-check matrix or the generator matrix of the code. (In particular, all graph-theoretic codes are linear codes.) Today, we'll look at the former class of graph-theoretic codes---in particular, expander codes.
\section{Expander Codes}
Consider a bipartite graph $B$ with $N$ vertices on the left and $M$ vertices on the right. We let $[N]$ and $[M]$ denote the vertices on the left and right, respectively. We now describe how $B$ gives rise to a linear code $C_B$ over $\FF_2$. We define the codewords of $C_B$ to be the binary strings $x_1x_2\cdots x_N$ such that
\[ \bigoplus_{i\leftrightarrow j} x_i = 0 \]
for all $j\in [M]$. (Here, the $\leftrightarrow$ symbol denotes adjacency.) In other words, the parity-check matrix is the bipartite adjacency matrix of $B$, where the rows correspond to vertices in $[M]$ and the columns correspond to vertices in $[N]$. By definition, the rate of this code is at least $1 - M / N$ (with equality when the parity checks are linearly independent).
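As a concrete illustration, membership in $C_B$ can be checked directly from the graph. The following is a minimal sketch; the adjacency-list representation and the name \texttt{is\_codeword} are our own, not from the lecture.

```python
# Hypothetical sketch: checking membership in the code C_B defined by a
# bipartite graph B.  Here nbrs[j] lists the left vertices adjacent to
# right (check) vertex j.

def is_codeword(x, nbrs):
    """x is a list of N bits; x is in C_B iff every right vertex
    sees an even number of 1s among its left neighbors."""
    return all(sum(x[i] for i in js) % 2 == 0 for js in nbrs)

# Toy example: N = 4, M = 2, each check vertex adjacent to two left vertices.
nbrs = [[0, 1], [2, 3]]
print(is_codeword([1, 1, 0, 0], nbrs))  # True: both parities are even
print(is_codeword([1, 0, 0, 0], nbrs))  # False: the first check fails
```

Note that this only verifies membership; the rate bound $1 - M/N$ reflects the fact that each check vertex imposes one linear constraint over $\FF_2$.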
To analyze the distance of this code, we now define some notation. For $S\subseteq [N]$, let $\Gamma(S)$ denote the neighborhood of $S$. In addition, define
\[ \Gamma^{\on{odd}}(S) := \{j\mid \text{$\#\{i\mid i\leftrightarrow j,i\in S\}$ is odd}\} \]
and
\[ \Gamma^{\on{unique}}(S) := \{j\mid\exists!\text{ $i\in S$ such that $i\leftrightarrow j$}\}. \]
By definition, $C_B$ has distance at least $D$ if and only if $\Gamma^{\on{odd}}(S)\neq\emptyset$ for all nonempty subsets $S\subseteq [N]$ of size less than $D$. However, $\Gamma^{\on{odd}}$ is rather difficult to analyze directly. We do know, though, that $\Gamma^{\on{odd}}(S)\supseteq\Gamma^{\on{unique}}(S)$: a vertex with exactly one neighbor in $S$ certainly has an odd number of them. Because $\Gamma^{\on{unique}}$ is more feasible to study, we bound the distance of $C_B$ by showing that $\Gamma^{\on{unique}}(S)\neq\emptyset$ for all nonempty $S\subseteq [N]$ such that $\abs S < D$.
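The three neighborhood sets can be computed in one pass over the edges leaving $S$, which also makes the containment $\Gamma^{\on{unique}}(S)\subseteq\Gamma^{\on{odd}}(S)$ easy to check experimentally. A minimal sketch (the representation and function name are ours):

```python
# Hypothetical sketch of the neighborhood sets used in the distance
# argument.  adj[i] lists the right neighbors of left vertex i.

def neighborhoods(S, adj, M):
    """Return Gamma(S), Gamma_odd(S), Gamma_unique(S) for a set S of
    left vertices, given M right vertices."""
    count = [0] * M                  # number of edges from S hitting each j
    for i in S:
        for j in adj[i]:
            count[j] += 1
    gamma = {j for j in range(M) if count[j] >= 1}
    gamma_odd = {j for j in range(M) if count[j] % 2 == 1}
    gamma_unique = {j for j in range(M) if count[j] == 1}
    return gamma, gamma_odd, gamma_unique

# Gamma_unique(S) is always contained in Gamma_odd(S):
adj = [[0, 1], [1, 2], [0, 2]]       # 3 left vertices, 3 right vertices
g, godd, guniq = neighborhoods({0, 1}, adj, 3)
print(guniq <= godd)                  # True
```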
We now define some graph-theoretic notions that will yield codes with good distance:
\begin{definition}
A bipartite graph $B$ with $N$ vertices on the left-hand side and $M$ vertices on the right-hand side is \textbf{$(c,d)$-regular} if every $i\in [N]$ has degree $c$ and every $j\in [M]$ has degree $d$.
\end{definition}
We restrict our attention to regular bipartite graphs, since they have the following nice ``expansion'' property: For $S\subseteq [N]$, notice that $\abs{\Gamma(S)}\le c\abs S$, and if equality holds, then $\abs{\Gamma^{\on{unique}}(S)} = c\abs S$. We'll also want our bipartite graph to have a similar property under slightly weaker constraints. This motivates the following two definitions involving the ``expansion'' property:
\begin{definition}
A $(c,d)$-regular graph is an \textbf{$(\alpha,\delta)$-expander} if for all $S\subseteq [N]$ such that $\abs S\le\delta N$, we have
\[ \abs{\Gamma(S)}\ge \alpha\cdot c\abs S. \]
\end{definition}
\begin{definition}
A $(c,d)$-regular graph is an \textbf{$(\alpha,\delta)$-unique expander} if for all $S\subseteq [N]$ such that $\abs S\le\delta N$, we have
\[ \abs{\Gamma^{\on{unique}}(S)}\ge \alpha\cdot c\abs{S}. \]
\end{definition}
Note that for expander graphs, we'll want $\alpha$ to be as close to $1$ as possible.
\begin{lemma}
Suppose $B$ is a $(c,d)$-regular bipartite graph. If $B$ is an $(\alpha,\delta)$-expander, then $B$ is also a $(2\alpha-1,\delta)$-unique expander.
\end{lemma}
\begin{proof}
Fix $S\subseteq [N]$ of size at most $\delta N$. Let $U = \Gamma^{\on{unique}}(S)$ and $T = \Gamma(S)\setminus U$. Counting the $c\abs S$ edges leaving $S$, each vertex of $U$ absorbs at least one of them and each vertex of $T$ absorbs at least two, so $c\abs S\ge \abs U + 2\abs T$. On the other hand, expansion gives $\abs U + \abs T = \abs{\Gamma(S)}\ge\alpha\cdot c\abs S$, and hence $2\abs U + 2\abs T\ge 2\alpha\cdot c\abs S$. Subtracting the first inequality from the second gives us $\abs U\ge (2\alpha -1)\cdot c\abs S$, as desired.
\end{proof}
\begin{theorem}
Suppose $B$ is a $(c,d)$-regular bipartite graph that is also an $(\alpha,\delta)$-expander with $\alpha > 1/2$. Then $C_B$ is a code with relative distance at least $\delta$.
\end{theorem}
Such codes, built using bipartite expander graphs, are known as \textbf{expander codes}.
\begin{exercise}
Verify that Theorem 1 follows from the above lemma.
\end{exercise}
\section{A Bit of History}
The first graph-theoretic codes were proposed in 1963 by Gallager, but with the limited computational resources available at the time, his codes could not be implemented, and so their nice properties went unnoticed. In 1981, Tanner rediscovered and generalized graph-theoretic codes, replacing the parity-check constraints with more general linear codes, and also raised the idea of using expander graphs. Graph-theoretic codes were rediscovered yet again by Sipser and Spielman, who put error-correcting codes and expander graphs together. At that time, however, explicit constructions of expander graphs with $\alpha > 1/2$ were not yet known. Only later, in a result due to Capalbo, Reingold, Vadhan, and Wigderson, were constructions of such good expanders (with $\alpha > 1/2$) discovered.
\section{Decoding Expander Codes}
To decode, suppose we receive a word $x_1x_2\cdots x_N$ containing at most a fixed number of errors. We decode this word using the FLIP algorithm, which is specified as follows:
Call a vertex $j\in [M]$ \emph{satisfied} if the parity condition at $j$ is satisfied by $x_1x_2\cdots x_N$, and \emph{unsatisfied} otherwise. We look for an $i\in [N]$ with more unsatisfied neighbors than satisfied neighbors and flip the value of $x_i$. We repeat this process until all constraints are satisfied; that is, until $x_1x_2\cdots x_N$ becomes a codeword. If not all constraints are satisfied but no flip would decrease the number of unsatisfied vertices, we also terminate.
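The flipping procedure above can be sketched as follows. This is a naive version that recomputes all parities after each flip; the names and the adjacency-list representation are ours, and achieving genuinely linear running time requires the incremental bookkeeping discussed below.

```python
# Minimal sketch of the FLIP algorithm.  adj[i] lists the right (check)
# neighbors of left vertex i; the word x is a list of bits, modified in place.

def flip_decode(x, adj, M):
    def unsatisfied():
        parity = [0] * M
        for i, bit in enumerate(x):
            if bit:
                for j in adj[i]:
                    parity[j] ^= 1
        return {j for j in range(M) if parity[j]}

    unsat = unsatisfied()
    while unsat:
        for i in range(len(x)):
            bad = sum(1 for j in adj[i] if j in unsat)
            if 2 * bad > len(adj[i]):   # majority of neighbors unsatisfied
                x[i] ^= 1               # flip this bit
                break
        else:
            return x                    # stuck: no flip helps
        unsat = unsatisfied()           # naive recomputation, O(N) per flip
    return x

adj = [[0, 1], [1, 2], [0, 2]]          # toy graph; codewords are 000 and 111
print(flip_decode([1, 1, 0], adj, 3))   # [1, 1, 1]
```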
\begin{theorem}
Suppose $\eps < \frac{\delta}{1+c}$ and $\alpha > \frac 34$. If there are at most $\eps N$ errors, then the FLIP algorithm will run for $O(N)$ iterations and terminate with $x_1x_2\cdots x_N$ being the nearest codeword.
\end{theorem}
\begin{proof}
It is clear that this algorithm runs for at most $M$ iterations, because the number of unsatisfied vertices is at most $M$ and decreases with each iteration. To prove the rest of the theorem, we prove a stronger bound on the number of flipping iterations and then use the fact that the distance of the current word from the transmitted codeword is at most
\[ \eps N + (\text{\# of iterations}) < \delta N. \]
Let $S$ be the set of corrupted positions at the start of the algorithm. For a vertex $j\in [M]$ to be unsatisfied, it must be adjacent to a corrupted position; in other words, we must have $j\in\Gamma(S)$. Because $\abs{\Gamma(S)}\le c\abs S$, there are at most $\eps Nc$ unsatisfied vertices in $[M]$ initially. Since each iteration decreases the number of unsatisfied vertices, the algorithm runs for at most $\eps Nc$ iterations, and so choosing $\eps$ such that $\eps N(1+c) < \delta N$ keeps the current word within distance $\delta N$ of the transmitted codeword throughout.
It remains to show we always end at a codeword---that is, the algorithm never gets stuck with unsatisfied constraints. Making this property hold is where the constraint on $\alpha$ comes in. Now let $S$ be the set of positions in which the current word differs from the transmitted codeword, so that $\abs S\le\delta N$ by the above. If $S$ is nonempty, the lemma gives us
\[ \abs{\Gamma^{\on{unique}}(S)}\ge (2\alpha -1)c\abs S, \]
which implies, by averaging, that there exists an $i\in S$ with at least $(2\alpha-1)c$ unique neighbors. Each unique neighbor of $i$ is adjacent to exactly one differing position and is therefore unsatisfied. Now, if $\alpha > \frac 34$, then $(2\alpha-1)c > c/2$. Hence $i$ has a majority of its neighbors unsatisfied, which means some bit can still be flipped.
\end{proof}
It is not difficult to show that with the right data structures, this algorithm decodes expander codes in linear time. With parallelization, it is even possible to achieve decoding that uses only $O(\log n)$ rounds of flipping. Recently, Viderman showed that it is also possible to decode in linear time when $\alpha > \frac 23$. Finally, note that in a family of expander codes, only $n$ goes off to infinity; all other parameters remain constant.
\begin{exercise}
Show that the decoding algorithm we described indeed takes linear time.
\end{exercise}
\section{Spielman Codes}
We just showed that we can decode expander codes in linear time, but encoding them efficiently is not obvious, since the construction specifies only the parity-check matrix. However, Spielman came up with \textbf{superconcentrator codes}, which have both linear time encoding and linear time decoding. We give an overview of such a code for a fixed rate of $R = 1/4$.
Spielman's code for message length $k$ is defined recursively in terms of the code for shorter message lengths. Given a message $x_1x_2\cdots x_k\in\{0,1\}^k$, the first $k$ bits of the encoded message are the message itself. We then compute $k/2$ parity-check bits and use Spielman's code for message length $k/2$ and rate $1/4$ to encode these parity-check bits into $2k$ bits total. This encoded string forms the next $2k$ bits of our encoded message. The final $k$ bits of the encoded message are $k$ parity-check bits computed from this length-$2k$ encoding of the parity-check bits. Observe that the runtime of the encoding algorithm satisfies the recurrence $T(k) = T(k/2) + O(k)$, which solves to $T(k) = O(k)$.
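The recursive structure above can be sketched as follows. This is only a structural sketch, not Spielman's actual construction: the stubs \texttt{half\_checks} and \texttt{tail\_checks} stand in for the linear-time parity maps given by sparse expander generators (our names), and here they carry no error-correcting content; the point is the shape of the recursion and the $4k$-bit output length.

```python
# Structural sketch of the recursive rate-1/4 encoder (not Spielman's
# actual expander-based maps; the parity maps below are placeholders).

def encode(msg, half_checks, tail_checks, base_encode, base_size=4):
    """Recursively encode a k-bit msg into 4k bits (rate 1/4)."""
    k = len(msg)
    if k <= base_size:
        return base_encode(msg)     # brute-force encoder for tiny k
    parity = half_checks(msg)       # k/2 parity bits, computed in O(k)
    middle = encode(parity, half_checks, tail_checks, base_encode, base_size)
    tail = tail_checks(middle)      # k more parity bits from the 2k-bit middle
    return msg + middle + tail      # k + 2k + k = 4k bits; T(k) = T(k/2) + O(k)

# Placeholder linear-time "parity" maps, just to exhibit the recursion:
half = lambda v: v[:len(v) // 2]
base = lambda v: v * 4
print(len(encode([0] * 8, half, half, base)))   # 32 = 4 * 8
```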
We obtain the parity bits in the above construction from a linear code in which the adjacency matrix of a sparse bipartite expander acts as the generator, and we do the same for the parity-check bits of the length-$2k$ string. The codes used here are known as \textbf{error reduction codes}: If all the parity-check symbols are correct and at most $\eps k$ message symbols are incorrect, then our previous flipping algorithm corrects all errors. We can even relax this a little---if at most $\tau k$ parity-check symbols and at most $\eps k$ message symbols are incorrect, then the flip algorithm corrects all but at most $\frac 12\tau k$ of the errors. This follows from carrying out the previous analysis a little more carefully with $\alpha > \frac 78$. Applying this fact lets us achieve a more robust decoding. Solving the recurrence again, the runtime of the decoding algorithm is also linear in $k$.
\end{document}