\documentclass{article}
\usepackage{amsmath}
\usepackage{amsthm}
\usepackage{libertine-type1}
\usepackage{helvet}
\usepackage[libertine]{newtxmath}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{geometry}
\geometry{verbose,tmargin=1in,bmargin=1in,lmargin=1in,rmargin=1in}
\usepackage{graphicx}
\usepackage{hyperref}
\hypersetup{colorlinks=true,citecolor=blue,urlcolor =black,linkbordercolor={1 0 0}}
\usepackage[linewidth=1pt]{mdframed}
\makeatletter
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Textclass specific LaTeX commands.
\theoremstyle{definition}
\newtheorem{example}{\protect\examplename}
\theoremstyle{definition}
\newtheorem{xca}{\protect\exercisename}
\theoremstyle{definition}
\newtheorem{defn}{\protect\definitionname}
\theoremstyle{remark}
\newtheorem{rem}{\protect\remarkname}
\theoremstyle{plain}
\newtheorem{thm}{\protect\theoremname}
\theoremstyle{plain}
\newtheorem{lem}{\protect\lemmaname}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% User specified LaTeX commands.
\date{}
\usepackage{MnSymbol}
\newcommand{\btimes}{\mathbin{\rotatebox[origin=c]{36}{$\pentagram$}}}
\newcommand\bleh{%
\mathrel{\ooalign{\hss$\btimes$\hss\cr%
\kern0.025ex\raise-0.88ex\hbox{\scalebox{2.5}
{$\circ$}}}}}
\renewcommand\qedsymbol{$\bleh$}
\renewcommand\labelenumi{(\roman{enumi})}
\renewcommand\theenumi\labelenumi
\DeclareMathOperator{\Ann}{Ann}
\DeclareMathOperator{\coker}{coker}
\DeclareMathOperator{\Spec}{Spec}
\DeclareMathOperator{\Hom}{Hom}
\DeclareMathOperator{\End}{End}
\DeclareMathOperator{\Supp}{Supp}
\DeclareMathOperator{\codim}{codim}
\DeclareMathOperator{\ch}{char}
\DeclareMathOperator{\Aut}{Aut}
\DeclareMathOperator{\Frob}{Frob}
\DeclareMathOperator{\Gal}{Gal}
\DeclareMathOperator{\GL}{GL}
\DeclareMathOperator{\Span}{Span}
\DeclareMathOperator{\sgn}{sgn}
\DeclareMathOperator{\tr}{tr}
\DeclareMathOperator{\Sym}{Sym}
\makeatother
\providecommand{\definitionname}{Definition}
\providecommand{\examplename}{Example}
\providecommand{\exercisename}{Exercise}
\providecommand{\lemmaname}{Lemma}
\providecommand{\remarkname}{Remark}
\providecommand{\theoremname}{Theorem}
\begin{document}
\input{preamble.tex}
\handout{CS 229r Information Theory in Computer Science}{Feb 10, 2020}{Instructor:
Madhu Sudan}{Scribe: Matthew Hase-Liu}{Lecture 5}
\global\long\def\wangle#1{\left\langle #1\right\rangle }
\global\long\def\ol#1{\overline{#1}}
\global\long\def\acts{\curvearrowright}
\global\long\def\ord#1#2{\text{ord}_{#1}(#2)}
\global\long\def\Id{\text{Id}}
\global\long\def\A{\mathbb{A}}
\global\long\def\R{\mathbb{R}}
\global\long\def\Q{\mathbb{Q}}
\global\long\def\N{\mathbb{N}}
\global\long\def\C{\mathbb{C}}
\global\long\def\P{\mathbb{P}}
\global\long\def\Z{\mathbb{Z}}
\global\long\def\mf#1{\mathfrak{#1}}
\global\long\def\ep{\varepsilon}
\global\long\def\vec#1{\overrightarrow{#1}}
\global\long\def\re#1{\text{Re}\,\left(#1\right)}
\global\long\def\im#1{\text{Im}\,\left(#1\right)}
\global\long\def\Div#1{\text{Div}\,\left(#1\right)}
\global\long\def\Res#1#2#3{\text{Res}_{#2}^{#3}\left(#1\right)}
\global\long\def\Ind#1#2#3{\text{Ind}_{#2}^{#3}\left(#1\right)}
Today, we look at the list-decoding bound, some musings on the limitations and tightness of our bounds, and the beginnings of ``algebraic'' coding theory, which we'll explore in much more depth next time.
As a reminder, don't forget to sign up for office hours this and next week (in groups of two)! Also, Chi-Ning's office hours will be streamed.
\begin{section}{What we've done so far}
We've proved two upper bounds in previous classes: the Hamming bound $R\le 1-H(\delta/2)$, obtained by packing balls of radius $d/2$, which looks like a convex curve; and the Plotkin bound $R\le 1-2\delta$, obtained by embedding $\{0,1\}^n$ into the Euclidean space $\mathbb{R}^n$. In particular, rate-distance pairs above these curves are unattainable.
On the other hand, we've also shown a lower bound (i.e. rates that are attainable) by greedy and random methods (the Gilbert-Varshamov bound): $R\ge 1-H(\delta)$.
\end{section}
\begin{section}{The Elias-Bassalygo bound}
\begin{center}
\includegraphics[width=0.5\textwidth]{graph}
\end{center}
The best (upper) bound we'll get in this course is called the Elias-Bassalygo bound:
\[
R\le 1-H\left(\frac{1}{2}\left(1-\sqrt{1-2\delta}\right)\right).
\]
The approach combines techniques from both the Hamming and Plotkin bounds. In particular, recall from the Hamming bound argument that a code of distance $d$ corrects approximately a $\delta/2$ fraction of errors uniquely. We similarly want to say that we can actually correct a $\frac{1}{2}(1-\sqrt{1-2\delta})$ fraction of errors with \textbf{small lists}, i.e., lists of size at most $n^2$.
Recall that in the Hamming case, saying that we can uniquely decode $\delta/2$ fraction of errors means that for any two codewords, balls of radius $d/2$ around them will not intersect.
We modify this by saying that if we draw balls of radius $\frac{n}{2}(1-\sqrt{1-2\delta})$ centered at codewords in our space, although there may be intersections, there won't be any points that land in more than $n^2$ balls. To prove this, we'll consider an embedding into Euclidean space again.
Returning to the Hamming case: there, the disjointness of the balls yields the bound
$2^n\ge 2^k \textrm{vol}(n,d/2)$, from which we derive the Hamming bound.
\begin{exercise}
Assuming the unproven claim above, show that $R\le 1-H\left(\frac{1}{2}(1-\sqrt{1-2\delta})\right)$ by essentially copying the same idea from the Hamming case.
\end{exercise}
As a hint, you should get the bound $n^2 2^n \ge 2^k \textrm{vol}\left(n,\frac{n}{2}(1-\sqrt{1-2\delta})\right)$. Compared to the Hamming case, the extra factor of $n^2$ on the left weakens the bound slightly, but the larger ball radius on the right strengthens it considerably. Balancing this tension is precisely what gives us the improved bound.
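Spelling out the hint (a sketch of the exercise's computation, using the standard estimate $\textrm{vol}(n,\tau n)=2^{H(\tau)n-o(n)}$ for $\tau\le 1/2$): taking logarithms of the hinted inequality with $\tau=\frac{1}{2}(1-\sqrt{1-2\delta})$ gives
\[
k \le n - H(\tau)\,n + 2\log n + o(n),
\qquad\text{i.e.,}\qquad
R=\frac{k}{n}\le 1-H\left(\frac{1}{2}\left(1-\sqrt{1-2\delta}\right)\right)+o(1).
\]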
We now prove the following lemma, which will fill in the details of our proof outline above.
\begin{lemma}
Let $C$ be a code with relative distance $\delta$, and set $\tau = \frac{1}{2}(1-\sqrt{1-2\delta})$, $t=\tau n$, and $d=\delta n$. Then, for every $w\in\{0,1\}^n$, there are at most $n^2$ codewords $c\in C$ with $\Delta(c,w)\le t$; that is, $w$ lies in at most $n^2$ of the balls of radius $t$ centered at codewords.
\end{lemma}
\begin{proof}
We proceed by contradiction, treating $\tau$ as a free parameter for now (we will ultimately show that $\tau = \frac{1}{2}(1-\sqrt{1-2\delta})$ is the ``correct'' value). Suppose $C$ is not list-decodable from $t=\tau n$ errors with list size $n^2$. Then, there is some vector $w\in\{0,1\}^n$, along with codewords $c_1,\ldots, c_{n^2+1}$, such that $\Delta(c_i,c_j)\ge d$ for all $i\ne j$ and $\Delta(c_i,w)\le t$ for all $i$. We want to show that, for the correct value of $\tau$, this is, in fact, impossible.
Like in the Plotkin bound argument, we consider the following embedding into Euclidean space: map $0\mapsto 1$ and $1 \mapsto -1$, and normalize by $\frac{1}{\sqrt{n}}$. Since $\wangle{x,y}=1-\frac{2\Delta(x,y)}{n}$ for the images $x,y$ of two binary strings, we have the following setup: vectors $v,u_1,\ldots,u_{n^2+1}\in \frac{1}{\sqrt{n}}\{-1,1\}^n$ with $\wangle{u_i,u_i}=1=\wangle{v,v}$, $\wangle{u_i,u_j}\le 1-2\delta$ for $i\ne j$, and $\wangle{u_i,v}\ge 1-2\tau$.
Intuitively, we want to reorient the origin so that the shifted vectors that are close to $v$ have large angles between each other (at least 90 degrees, for instance). Doing so forces the inner products of different shifted vectors to be at most 0, from which we can conclude, recalling our analysis of the Plotkin bound, that there can be at most $2n$ vectors (which is clearly less than $n^2+1$ for $n\ge 2$), which yields a contradiction.
Making this more formal, we let the new origin be a scaling of $v$, say $\alpha v$ for some $\alpha\ge 0$. Then, the shifted vectors are $\tilde{u}_i=u_i-\alpha v$, and the inner product between $\tilde{u}_i$ and $\tilde{u}_j$ (for $i\ne j$) is $\wangle{\tilde{u}_i,\tilde{u}_j}=\wangle{u_i,u_j}-\alpha\wangle{u_i,v}-\alpha\wangle{u_j,v}+\alpha^2\wangle{v,v}\le 1-2\delta -2\alpha(1-2\tau)+\alpha^2$.
We can minimize the quantity on the right by taking $\alpha$ to be $1-2\tau$. By manipulating the terms, it's clear that taking $\tau\le \frac{1}{2}(1-\sqrt{1-2\delta})$ forces all the inner products to be nonpositive, and then we're done.
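For completeness, the minimization can be spelled out: writing $q(\alpha)=1-2\delta-2\alpha(1-2\tau)+\alpha^2$, we have $q'(\alpha)=-2(1-2\tau)+2\alpha$, which vanishes at $\alpha=1-2\tau$, and
\[
q(1-2\tau)=1-2\delta-(1-2\tau)^{2}\le 0
\iff (1-2\tau)^{2}\ge 1-2\delta
\iff \tau\le\frac{1}{2}\left(1-\sqrt{1-2\delta}\right),
\]
where the last equivalence uses $\tau\le\frac{1}{2}$, so that $1-2\tau\ge 0$.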
\end{proof}
As might be expected, $n^2$ isn't anything special in the proof above. In fact, we can take the length of the list to be any polynomial in $n$:
\begin{exercise}
Show that everything above works out if we take the list size to be polynomial in $n$.
\end{exercise}
In the following section, we analyze how the Elias-Bassalygo bound compares to the other bounds from earlier.
\end{section}
\begin{section}{Comparison of the Elias-Bassalygo bound with other bounds}
First, we note that because $H(\cdot)$ is monotone increasing on $[0,1/2]$ and $\delta/2\le \tau\le \delta$, the Elias-Bassalygo bound lies in between the Gilbert-Varshamov and Hamming bounds. There is, in fact, another bound called the McEliece-Rodemich-Rumsey-Welch (also JPL or LP) bound \cite{LP} that is slightly better than the Elias-Bassalygo bound, but we will not cover it in this course.
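As a quick numerical sanity check (a small script, not part of the lecture), one can verify that $1-H(\delta)\le 1-H(\tau)\le 1-H(\delta/2)$ for $\delta\in(0,1/2)$, i.e., the Elias-Bassalygo curve sits between the Gilbert-Varshamov and Hamming curves:

```python
import math

def H(p):
    """Binary entropy function H(p) = -p log p - (1-p) log (1-p)."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def hamming_rate(delta):
    """Hamming (upper) bound: R <= 1 - H(delta/2)."""
    return 1 - H(delta / 2)

def eb_rate(delta):
    """Elias-Bassalygo (upper) bound: R <= 1 - H((1 - sqrt(1-2*delta))/2)."""
    tau = 0.5 * (1 - math.sqrt(1 - 2 * delta))
    return 1 - H(tau)

def gv_rate(delta):
    """Gilbert-Varshamov (lower) bound: R >= 1 - H(delta)."""
    return 1 - H(delta)

# The three curves are ordered GV <= EB <= Hamming on (0, 1/2).
for k in range(1, 50):
    delta = k / 100
    assert gv_rate(delta) <= eb_rate(delta) <= hamming_rate(delta)
```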
We note that as $\delta\to1/2$, we have $\tau\to1/2$, and similarly as $\delta\to0$, we have $\tau\to0$. Near $0$, we have $H(\delta)\approx\delta\log\frac{1}{\delta}$. Both the Elias-Bassalygo bound and the Hamming bound essentially tell us that, in this regime, $R\le1-\frac{\delta}{2}\log\frac{2}{\delta}\approx 1-\frac{\delta}{2}\log\frac{1}{\delta}$. From the Gilbert-Varshamov bound (recall the greedy and random constructions), we also have $R\ge1-\delta \log\frac{1}{\delta}$.
In the other extreme, the Elias-Bassalygo bound comes close to the Gilbert-Varshamov bound. If we write $\delta=1/2-\varepsilon$, the Gilbert-Varshamov bound gives $R\ge\Omega(\varepsilon^2)$, whereas list-decoding gives us $R\le1-H\left(\frac{1}{2}(1-\sqrt{1-2\delta})\right)=O(\varepsilon)$. It turns out that the best we can do comes from the LP bound, which gives us $R=O\left(\varepsilon^2\log\frac{1}{\varepsilon}\right)$---the point is that the right answer should look something like $\varepsilon^2$ instead of $\varepsilon$.
\end{section}
At this point we took a quick break and did the following exercise as a class:
\begin{exercise}
Introduce yourself. What is your name, grade, and concentration? And why are you taking this class?
\end{exercise}
\begin{section}{Beginnings of the algebraic theory}
So far, we haven't seen too many concrete things that are useful for explicit, algorithmic purposes---perhaps the only thing we've encountered is Gilbert's greedy construction.
When talking about the algebraic theory, we start to care about the structures that can be put on $[n]$. Some examples include $\mathbb{F}_q,\mathbb{F}_q^m$, and subsets of $\mathbb{F}_q^m$, where $\mathbb{F}_q$ is the\footnote{We can say \textbf{the} finite field of a certain size because any two finite fields of the same size are isomorphic! In particular, the finite field of size $q=p^r$, with $p$ prime, is the splitting field of the polynomial $x^q-x$ over $\mathbb{F}_p.$} finite field of $q$ elements.
Let $S\subset \mathbb{F}_q^m$. We can consider codes $C$ that are linear subspaces of the set of functions $\{f:S\to\mathbb{F}_q\}$, by which we mean that $f,g\in C,\alpha\in\mathbb{F}_q\implies\alpha f+g\in C$.
As we'll see in the future, we want $C$ to comprise functions that have relatively few zeroes in $S$. In the simplest case, we can let $S=\mathbb{F}_q$ and take polynomial functions of bounded degree. This leads to the construction of Reed-Solomon codes.
\begin{example}[Reed-Solomon codes]
Let $C_k=\{\textrm{polynomials of degree at most $k-1$ with coefficients in $\mathbb{F}_q$}\}$. Take $\alpha_1,\ldots,\alpha_n$ to be $n$ distinct elements of $\mathbb{F}_q$, and define the encoding map $E:\mathbb{F}_q^k\to\mathbb{F}_q^n$ as follows: interpret the input $m\in\mathbb{F}_q^k$ as the coefficient vector, in increasing order of degree, of a polynomial $f$ of degree at most $k-1$ (note that such a polynomial has at most $k$ coefficients), and output $E(m)=[f(\alpha_1),\ldots, f(\alpha_n)]$.
Noting that the number of roots of a nonzero polynomial of degree at most $k-1$ is at most $k-1$, the difference of the polynomials for two distinct codewords vanishes at no more than $k-1$ of the $\alpha_i$, so we immediately get $d\ge n-k+1$.
\end{example}
This means the Reed-Solomon code meets the Singleton bound ($d\le n-k+1$ for any code) with equality!
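To make the construction concrete, here is a small illustration (not from the lecture) over the prime field $\mathbb{F}_7$ with $n=7$, $k=3$, checking the minimum distance exhaustively:

```python
from itertools import product

q, k = 7, 3                    # field F_7, polynomials of degree at most k-1 = 2
alphas = list(range(q))        # n = q = 7 distinct evaluation points
n = len(alphas)

def encode(m):
    """Evaluate the polynomial with coefficient vector m at each alpha."""
    return tuple(sum(c * pow(a, i, q) for i, c in enumerate(m)) % q
                 for a in alphas)

# The code is linear, so its minimum distance equals the minimum
# Hamming weight over all nonzero codewords.
codewords = [encode(m) for m in product(range(q), repeat=k)]
d = min(sum(s != 0 for s in cw) for cw in codewords if any(cw))
assert d == n - k + 1          # Singleton bound met with equality: d = 5
```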
\begin{exercise}
In the proof above, we used the fact that a nonzero polynomial of degree at most $k$ over a field has at most $k$ roots. Why is this true? This is a little subtle, because the polynomial $x^2-1$, for instance, actually has more than two roots modulo $8$!
\end{exercise}
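A quick numerical illustration of the subtlety (a small check, not an answer to the exercise): over $\mathbb{Z}/8$, which is not a field, $x^2-1$ has four roots, while over the field $\mathbb{F}_7$ it has only two:

```python
def roots_of_x2_minus_1(m):
    """Return the roots of x^2 - 1 in Z/mZ."""
    return [x for x in range(m) if (x * x - 1) % m == 0]

# Z/8 has zero divisors, so the degree bound on roots fails.
assert roots_of_x2_minus_1(8) == [1, 3, 5, 7]
# F_7 is a field: a degree-2 polynomial has at most 2 roots.
assert roots_of_x2_minus_1(7) == [1, 6]
```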
\end{section}
\begin{thebibliography}{}
\bibitem{LP}
R. J. McEliece, E. R. Rodemich, H. Rumsey, and L. R. Welch. ``New Upper Bounds on the Rate of a Code via the Delsarte-MacWilliams Inequalities.'' IEEE Trans. Inform. Theory, vol. IT-23, pp. 157--166, Mar. 1977.
\end{thebibliography}
\end{document}