\documentclass[10pt]{article}
\usepackage{amsfonts,amsthm,amsmath,amssymb}
\usepackage{array}
\usepackage{epsfig}
\usepackage{fullpage}
\usepackage[colorlinks = false]{hyperref}
\newcommand{\1}{\mathbbm{1}}
\DeclareMathOperator*{\argmin}{argmin}
\DeclareMathOperator*{\argmax}{argmax}
\newcommand{\x}{\times}
\newcommand{\Z}{\mathbb{Z}}
\newcommand{\Q}{\mathbb{Q}}
\newcommand{\R}{\mathbb{R}}
\newcommand{\N}{\mathbb{N}}
\newcommand{\F}{\mathbb{F}}
\newcommand{\E}{\mathop{\mathbb{E}}}
\renewcommand{\bar}{\overline}
\renewcommand{\epsilon}{\varepsilon}
\newcommand{\eps}{\varepsilon}
\newcommand{\DTIME}{\textbf{DTIME}}
\renewcommand{\P}{\textbf{P}}
\newcommand{\SPACE}{\textbf{SPACE}}
\begin{document}
\input{preamble.tex}
\newtheorem{example}[theorem]{Example}
\theoremstyle{definition}
\newtheorem{defn}[theorem]{Definition}
\handout{CS 229r Information Theory in Computer Science}{February 28, 2019}{Instructor: Madhu Sudan}{Scribe: Alec Sun}{Lecture 10}

\section{The Road Map}
Today we will:
\begin{itemize}
\item Wrap up polar codes.
\item Start communication complexity, covering:
\begin{itemize}
\item Basic definitions
\item Examples
\item Lower bounds
\end{itemize}
\end{itemize}

\section{Summary of Polar Codes}
The essential objects in the analysis of polar codes are martingales. Let $x_0,\ldots,x_t$ be a $[0,1]$-bounded martingale, i.e., $\E[x_i \mid x_0,\ldots,x_{i-1}] = x_{i-1}$ for all $i$.
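As a concrete illustration (a sketch of my own, not taken from lecture), the martingale arising from the binary erasure channel is worth keeping in mind: a node with value $x$ has children with values $x^2$ and $2x - x^2$, and we follow a uniformly random child, so $\E[x_i \mid x_{i-1}] = \frac{1}{2}(x_{i-1}^2 + 2x_{i-1} - x_{i-1}^2) = x_{i-1}$. Simulating it shows the values accumulating near $0$ and $1$:

```python
import random

def step(x):
    """One step of the [0,1]-bounded martingale from the binary erasure
    channel: the two children of a node with value x have values x^2 and
    2x - x^2, and we follow a uniformly random one.  The branch average is
    (x^2 + (2x - x^2)) / 2 = x, so the martingale property holds."""
    return x * x if random.random() < 0.5 else 2 * x - x * x

def sample_path(x0, t):
    """Follow one random root-to-leaf path for t steps."""
    x = x0
    for _ in range(t):
        x = step(x)
    return x

random.seed(229)
t, n = 40, 10_000
finals = [sample_path(0.5, t) for _ in range(n)]
unpolarized = sum(1e-6 < x < 1 - 1e-6 for x in finals) / n
print(f"fraction still in (1e-6, 1 - 1e-6) after {t} steps: {unpolarized}")
```

Running this, almost every path ends up within $10^{-6}$ of $0$ or $1$ after $40$ steps, previewing the polarization phenomenon characterized below.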
There are two ways to characterize polarization:
\begin{itemize}
\item \textbf{Local Polarization.} There is \emph{variance in the middle}: for all $\tau>0$ there exists $\sigma>0$ such that for all $i$, if $x_{i-1}\in (\tau,1-\tau)$ then $\mathrm{Var}(x_i\mid x_{i-1})\ge \sigma^2.$ There is also \emph{suction at the ends}: there exists $\theta>0$ such that for all $c<\infty$ there exists $\tau>0$ such that for all $i$, if $x_{i-1}\le \tau$ then
$$\Pr[x_i < x_{i-1}/c] \ge \theta,$$
and symmetrically when $x_{i-1}\ge 1-\tau.$ In our martingale we had $\theta = 1/2.$
\item \textbf{Strong Polarization.} For all $c>1$ there exists $\beta<1$ such that for all $t,$
$$\Pr[ x_t\in (c^{-t}, 1-c^{-t})] = O(\beta^t),$$
where the $O(\cdot)$ is with respect to $t.$
\end{itemize}
A theorem that we did not prove is that local polarization implies strong polarization. This theorem is useful because we no longer need to analyze the long-term behavior of the martingale; instead we can focus on a single step. Recall that we were looking at a tree-like process, where the value $x_{i-1}$ is the conditional entropy of a node given the variables above it. Why should such a process exhibit local polarization? Label a node $Z_a^{i-1}$ that branches out to children $Z_b^i, Z_c^i.$ The quantities we are looking at are
\begin{align*}
&H(Z_a^{i-1}\mid Z_{<a}^{i-1}), \\
&H(U\mid A), \\
&H(V\mid B), \\
&H(U+V\mid A,B), \\
&H(V\mid A,B,U+V).
\end{align*}
Here $U$ corresponds to one node conditioned on the set of nodes $A$ above it, and $V$ corresponds to a different node conditioned on a different set of nodes $B$ above it. Since $(U,A)$ and $(V,B)$ are independent, the chain rule gives
$$H(U+V\mid A,B) + H(V\mid A,B,U+V) = H(U\mid A) + H(V\mid B),$$
so the average of the two children's entropies equals the common parent value, which is exactly the martingale property.
\begin{center} \vspace{0.2in} \includegraphics[scale = 0.4]{polar-tree.png} \vspace{0.2in} \end{center}
To get a rough sense of how the entropy behaves, we can drop the conditioning and consider the following question: how do $H(U)$ and $H(V)$ compare with $H(U+V)$ and $H(V\mid U+V)$? If $U,V$ are i.i.d. Bernoulli random variables, what does $U+V$ look like? Does it have variance in the middle? Does it have suction at the ends?
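These questions can be checked directly. In a minimal sketch (the helper names are mine): for $U,V$ i.i.d.\ $\mathrm{Bern}(p)$, the sum $U+V$ over $\F_2$ is $\mathrm{Bern}(2p(1-p))$, so $H(U+V) = h(2p(1-p))$ where $h$ is the binary entropy function, and by the chain rule $H(V\mid U+V) = H(U,V) - H(U+V) = 2h(p) - h(2p(1-p))$:

```python
import math

def h(p):
    """Binary entropy in bits, with h(0) = h(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def children(p):
    """Entropies of the two 'children' when U, V ~ Bern(p) i.i.d.:
    U+V (mod 2) ~ Bern(2p(1-p)), and by the chain rule
    H(V | U+V) = H(U, V) - H(U+V) = 2 h(p) - h(2p(1-p))."""
    q = 2 * p * (1 - p)
    return h(q), 2 * h(p) - h(q)

for p in (0.05, 0.11, 0.30):
    up, down = children(p)
    print(f"p={p}: H(U)={h(p):.4f}  H(U+V)={up:.4f}  H(V|U+V)={down:.4f}")
```

For every $p\in(0,1/2)$ we have $2p(1-p) \in (p, 1/2]$, so $H(U+V) > h(p) > H(V\mid U+V)$ while the sum $H(U+V) + H(V\mid U+V) = 2h(p)$ is conserved: one child moves up and the other down, which is the source of variance in the middle.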
\section{Polarization of Bernoulli Random Variables}
\begin{center} \vspace{0.2in} \includegraphics[scale=0.4]{uv.png} \vspace{0.2in} \end{center}
Let $0 \le p \le 1/2$ and suppose $U, V$ are i.i.d.\ $\mathrm{Bern}(p)$ random variables.