\documentclass[11pt]{article}
\input{../preamble.tex}
\usepackage{amsmath}
\newcommand{\Log}{{\rm log}}
\newcommand{\ep}{\epsilon}
\newcommand{\de}{\delta}
\newcommand{\bscp}{\mbox{BSC}_p}
\newcommand{\DD}{\mathcal{D}}
\newcommand{\Ex}{\mbox{Exp}}
\newcommand{\Ball}{\mbox{Ball}}
\begin{document}
\lecture{20}{November 25, 2002}{Madhu Sudan}{Amit J. Deshpande}
%--- overview --
\section{Overview}
In this lecture, we survey complexity results for coding
problems: known hardness results and some open questions.
\begin{itemize}
\item Hardness of the Nearest Codeword Problem (NCP)
\item Approximation variants
\item Decoding with preprocessing
\item Decoding with relatively near codeword
\item Minimum distance problem
\end{itemize}
%--- nearest codeword problem --
\section{Nearest Codeword Problem}
The problem of finding the nearest codeword (or maximum likelihood
decoding) to a given received vector is of crucial importance in
the theory of error-correcting codes. In the general case, where
the code is described by an encoding circuit, even finding
a message corresponding to a given codeword is already hard, so we
restrict our attention to linear codes: given the generator
matrix of the code and a received vector, find
a codeword nearest to the received vector. We formalize this as follows.
\begin{definition}{\bf (Nearest Codeword Problem - NCP)} Given a code
with generator matrix $G$ and received vector $r$, find $x$ that
minimizes $\Delta(xG, r)$.
\end{definition}
How hard is it to solve NCP? We will show that NCP is hard even in
the special case $r = \bar{1}$. This is done by a reduction from
Max Cut (a well-known NP-hard problem).
\begin{definition}{\bf (Max Cut Problem)} Given a graph $H=(V,E)$, find
$S \subseteq V$ such that the number of edges between $S$ and
$\bar{S}$ is maximized.
\end{definition}
The reduction goes as follows. Let $G$ be the incidence matrix of a
graph $H=(V,E)$ with $|V|=k$ and $|E|=n$, so each column of $G$ has
exactly two $1$'s, one for each endpoint of the corresponding edge. A
message $x \in \{0,1\}^k$ corresponds to the subset $S \subseteq V$
specified by its $1$'s, and over $GF(2)$ the coordinate of $xG$ at edge
$e=(u,v)$ equals $x_u + x_v$, which is $1$ exactly when $e$ has one
endpoint in $S$ and the other in $\bar{S}$, i.e., when $e$ is a
crossing edge. Thus the weight of $xG$ is the size of the cut
$(S,\bar{S})$, and finding a max cut is equivalent to finding a
maximum weight codeword (meaning, a codeword nearest to $\bar{1}$).
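The correspondence between cuts and codeword weights can be checked on a small example (an illustrative sketch; the graph and function names below are our own, not from the notes):

```python
# Sketch: Max Cut <-> codeword weight under the incidence-matrix reduction.
# Over GF(2), the coordinate of xG at edge e = (u, v) equals x_u XOR x_v,
# which is 1 exactly when e crosses the cut (S, S-bar) defined by x.

def codeword(x, edges):
    """One coordinate per edge: x_u XOR x_v over GF(2)."""
    return [x[u] ^ x[v] for (u, v) in edges]

def cut_size(x, edges):
    """Number of edges with exactly one endpoint in S = {v : x_v = 1}."""
    return sum(1 for (u, v) in edges if x[u] != x[v])

# A 4-cycle: the max cut has size 4 (take alternate vertices).
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
x = [1, 0, 1, 0]                    # S = {0, 2}
w = codeword(x, edges)
assert sum(w) == cut_size(x, edges) == 4
# Distance to the all-ones received vector is n minus the weight:
n = len(edges)
assert n - sum(w) == 0              # x attains the nearest codeword to all-ones
```

Any other choice of $x$ gives a smaller cut on this graph, hence a codeword farther from $\bar{1}$.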
%--- approximation variants of NCP --
\section{Approximation variants of NCP}
There are three important approximation variants of the problem. For
a given instance $(G,r)$ we define $\tau =
\min_{x}\{\Delta(xG,r)\}$, and let $\alpha > 1$ be our approximation
parameter.
\begin{definition}{\bf (Search question):} Find $x'$ such that $\tau
\leq \Delta(x'G,r) \leq \alpha \cdot \tau$.
\end{definition}
\begin{definition}{\bf (Estimation question):} Estimate $t$ such that
$\tau \leq t \leq \alpha \cdot \tau$.
\end{definition}
\begin{definition}{\bf (Gap decision problem):}{\em (``promise''
problem)} Given $(G,r,t)$ with the promise that $\tau \notin (t,
\alpha t]$, decide whether $\tau \leq t$.
\end{definition}
It is easy to observe that a solution to the search problem gives a
solution to the estimation problem, and a solution to the estimation
problem gives a solution to the gap decision problem. Also, as
$\alpha$ gets closer to $1$, the problems get harder. Analogous
definitions can be made for the maximization versions of these
problems.
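The two reductions in this observation are direct, as the following sketch shows (the oracle and all names here are hypothetical, purely for illustration):

```python
# Sketch: search => estimation => gap decision for NCP.
# `search_oracle(G, r)` stands for a hypothetical alpha-approximate search
# oracle returning some x' with tau <= Delta(x'G, r) <= alpha * tau.
from itertools import product

def hamming(u, v):
    return sum(a != b for a, b in zip(u, v))

def encode(x, G):
    """xG over GF(2)."""
    n = len(G[0])
    return [sum(x[i] * G[i][j] for i in range(len(G))) % 2 for j in range(n)]

def estimate(G, r, search_oracle):
    """Estimation from search: t = Delta(x'G, r) lies in [tau, alpha*tau]."""
    x_prime = search_oracle(G, r)
    return hamming(encode(x_prime, G), r)

def gap_decide(G, r, t, alpha, search_oracle):
    """Gap decision from estimation, under the promise tau not in (t, alpha*t]."""
    # If tau <= t the estimate is <= alpha*t; if tau > alpha*t it exceeds alpha*t.
    return estimate(G, r, search_oracle) <= alpha * t

# Toy check with an exact (alpha = 1) oracle obtained by brute force:
def exact_oracle(G, r):
    return min(product([0, 1], repeat=len(G)),
               key=lambda x: hamming(encode(x, G), r))

G = [[1, 1, 0], [0, 1, 1]]
r = [1, 1, 1]
assert estimate(G, r, exact_oracle) == 1       # tau = 1 for this instance
assert gap_decide(G, r, 1, 1, exact_oracle)
```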
%--- hardness of approximating NCP --
\section{Hardness of approximating NCP}
A critical question is: is it hard even to find an
approximately nearest codeword?
We know that Max Cut is hard to approximate to within some factor $\alpha <
1$, and we can use this fact to show hardness of approximation for NCP. Elementary
probability (the first moment method, applied to a uniformly random cut) shows
that every graph has a cut of size at least $|E|/2$, where $|E|$ is the number
of edges. The reduction that we used to show that NCP is NP-hard says that a
cut of size $x$ corresponds to a codeword of
weight $x$, i.e., a codeword within distance $n-x$ of $\bar{1}$. Now
suppose the max cut has size $x$ (so $\tau = n-x$ and $x \geq n/2$), and
suppose we have a $\beta$-approximation to NCP with $1 < \beta < 2$: it
returns a codeword of weight $x'$ with $n-x \leq n-x' \leq \beta(n-x)$. Then
\[
x' \;\geq\; n - \beta(n-x) \;=\; n(1-\beta) + \beta x \;\geq\; 2x(1-\beta) + \beta x \;=\; (2-\beta)\,x,
\]
using $n \leq 2x$ and $1-\beta \leq 0$. So $(2-\beta)x \leq x' \leq x$,
which is an $\alpha = (2-\beta)$-approximation to Max Cut, and $\alpha
\rightarrow 1$ as $\beta \rightarrow 1$. But we already know that Max
Cut cannot be approximated to within some $\alpha < 1$,
which implies the corresponding hardness result for NCP:
\begin{theorem}
NCP is hard to approximate to within some $\beta > 1$.
\end{theorem}
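The approximation-factor arithmetic behind this theorem can be sanity checked numerically; the following sketch uses illustrative numbers of our own, not values from the notes.

```python
# Sketch: numeric check of the Max Cut guarantee extracted from a
# beta-approximate NCP answer on the incidence-matrix reduction.
# Illustrative numbers: x is the max cut, n the number of edges.

n, x = 100, 60            # every graph has a cut of size x >= n/2
beta = 1.1                # hypothetical NCP approximation factor
# The NCP oracle returns a codeword of weight x' with
#   n - x <= n - x' <= beta * (n - x),
# so in the worst case x' = n - beta * (n - x):
worst_x_prime = n - beta * (n - x)
# Using n <= 2x and 1 - beta <= 0, this is at least (2 - beta) * x:
assert worst_x_prime >= (2 - beta) * x
```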
Moreover, we can prove something stronger, as this problem has a
self-improving property.
\begin{theorem}
If $\beta$-approximation to NCP is hard, then
$\beta^2$-approximation to NCP is also hard. Applying this repeatedly,
any constant-factor approximation to NCP is hard.
\end{theorem}
\begin{proof}
The proof involves a clever construction: given a generator matrix $G$
of a code of length $n$, we construct a ``product'' generator matrix
$G^{(2)}$ of a code of length $n^2$ such that $G$ has a
codeword of weight $n-w$ iff $G^{(2)}$ has a codeword of weight $n^2 -
w^2$.
A codeword of $G^{(2)}$ is an $n \times n$ matrix whose columns are
labelled by the coordinates of a codeword of $G$. Each column is a
codeword of $G$ or the complement of a codeword of $G$, according to
whether its label is $0$ or $1$, respectively. Perhaps surprisingly,
this is a linear code.
If $G$ has a codeword $c$ of weight $n-w$, consider the codeword of
$G^{(2)}$ labelled by $c$ itself, with $\bar{1}$ (the complement of
the zero codeword) in all columns labelled $1$, and $c$ in all columns
labelled $0$. Its weight is $(n-w)\cdot n + w\cdot(n-w) = n^2 - w^2$.
Moreover, this is the best possible: a column labelled $1$ has weight
at most $n$, a column labelled $0$ has weight at most $n-w$, and the
label has at most $n-w$ ones, so no codeword of $G^{(2)}$ can stuff
more $1$'s into the matrix.
Since the distances from $\bar{1}$ obey the same relation ($w$ for $G$
corresponds to $w^2$ for $G^{(2)}$), a $\beta^2$-approximation
algorithm for the code generated by $G^{(2)}$ yields a
$\beta$-approximation for the code generated by $G$. Thus
$\beta$-approximation hardness for $G$ translates into
$\beta^2$-approximation hardness for $G^{(2)}$.
\end{proof}
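The weight calculation in the proof can be checked on a toy codeword (a sketch; the example codeword is our own):

```python
# Sketch of the squaring construction from the proof: given a codeword c
# of weight n - w, build the n x n matrix whose column i is the all-ones
# vector if c_i = 1 and c itself if c_i = 0 (so the label is c).
# Its weight is (n - w) * n + w * (n - w) = n^2 - w^2.

def squared_codeword(c):
    n = len(c)
    return [[1] * n if c[i] == 1 else list(c) for i in range(n)]

c = [1, 1, 0]                          # weight n - w = 2, so n = 3, w = 1
n, w = len(c), len(c) - sum(c)
weight = sum(sum(col) for col in squared_codeword(c))
assert weight == n * n - w * w         # 9 - 1 = 8
```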
%--- Criticism --
\section{Criticism}
These results have drawn a lot of criticism, which gives rise to the
following problems:
\begin{itemize}
\item The code shouldn't be part of the input; instead, we should be
  allowed a lot of preprocessing time (depending only on the code) to
  devise the decoding algorithm.
\item How do these results relate to the error-correction property?
  To make sense, we should be trying to correct fewer errors than the
  minimum distance of the code.
\item The codes we saw here have very sparse generator matrices, since
  they come from incidence matrices of graphs. We would like hardness
  results for better codes, e.g., Reed-Solomon codes,
  algebraic geometry codes, LDPC codes, Turbo codes (any of your
  favourite codes).
\end{itemize}
We will analyze some results that try to address these questions.
%--- Answers to the critisism --
\section{More hardness results addressing the criticism}
\subsection{Hardness of decoding a fixed family of codes [Bruck-Naor]}
The first criticism, regarding decoding with preprocessing, was
addressed by Bruck and Naor \cite{BN}; the idea is to ``inject'' the
NCP instance into the received vector while keeping the code fixed.
Suppose we are given an NCP instance consisting of a code $B$ (the
incidence matrix of a graph $G$ on $k$ vertices) and a received vector
$r$. The fixed code $C$ has a twin-pair of columns for every pair of
vertices $(u,v)$, so its generator matrix has $2 \binom{k}{2}$
columns. Construct a new received vector $r'$ as follows: if $(u,v)$
is an edge of $G$, duplicate the corresponding entry of $r$ in the
twin coordinates of $r'$; otherwise put $0,1$ in those coordinates.
Now note that $\Delta (xC,r') = N/2 - n + 2 \Delta (xB,r)$, where $N$
and $n$ are the block lengths of $C$ and $B$, respectively: each
non-edge pair contributes exactly $1$ regardless of $x$, while each
edge pair contributes $0$ or $2$. The two distances are thus linearly
related, so an exact NCP algorithm for the fixed code $C$ would solve
NCP for $B$; hence we cannot compute NCP exactly for the code $C$.
This method also works when the generator matrix is $a$-sparse (in
fact, more generally). Hardness of approximation in this preprocessing
setting is studied by Feige and Micciancio \cite{FM}.
\subsection{Decoding codes up to the minimum distance [Dumer-Micciancio-Sudan]}
This addresses the other criticism, regarding the relation to the
error-correction property. Dumer, Micciancio, and Sudan \cite{DMS}
show that one can ``boost'' the distance of the code without altering
the problem too much. This is done via a hardness result for a version
of the gap decision problem for the minimum distance.
Suppose that finding the nearest codeword in the code generated by $A$
is hard to approximate (to within a factor of $100$, say). Then we
have specific $A, r, d$ such that it is hard to tell whether $\tau >
d$ or $\tau \leq d/100$ (the underlying reduction succeeds with high
probability). The trick is to attach to $A$ a generator matrix $B$ of
a code of minimum distance $d$, and construct an appropriate received
vector $r'$.
Dumer, Micciancio, and Sudan \cite{DMS} show that decoding a code of
minimum distance $d$ in the presence of fewer than $d$ errors is
NP-hard.
%--- minimum distance problem (MADHU SKIPPED IN THE CLASS) --
%\section{Computing minimum distance}
%{\sf
%\begin{quote}
%Madhu skipped this part in the class ...
%\end{quote}
%}
%--- open questions --
\section{Open questions}
All these results still leave a few more open questions:
\begin{itemize}
\item Can NCP be solved in polynomial time for some asymptotically good
  family of codes? For Reed-Solomon codes? For your favourite code?
\item Does there exist a single decoding algorithm that decodes all
  codes up to half the minimum distance?
\item Does there exist an algorithm giving a lower bound on the minimum
  distance which guarantees that if the relative distance is
  $1-\frac{1}{q}-\epsilon$, then the lower bound given by the algorithm
  is at least $1-\frac{1}{q}-\epsilon^2$?
\end{itemize}
%--- bibliography --
\begin{thebibliography}{9}
\small
\bibitem[1]{BN}
J.~Bruck and M.~Naor, {\em The hardness of decoding linear codes with
  preprocessing}, IEEE Transactions on Information Theory, pp. 381--385,
  May 1990.
\bibitem[2]{FM}
U.~Feige and D.~Micciancio, {\em The inapproximability of lattice and
  coding problems with preprocessing}, Proceedings of the IEEE Conference
  on Computational Complexity, pp. 44--52, 2002.
\bibitem[3]{DMS}
I.~Dumer, D.~Micciancio, and M.~Sudan, {\em Hardness of approximating the
  minimum distance of a linear code}, IEEE Transactions on Information
  Theory, January 2003 (to appear).
\end{thebibliography}
\end{document}