CS 225: Pseudorandomness
Spring 2004

SYLLABUS

Course meetings: Tue-Thu 1-2:30, Pierce Hall 209 (29 Oxford Street)

Lecturer: Salil Vadhan
Office: Maxwell-Dworkin 337
Shopping period office hours: Thu 2/5 2:30-3:30 PM, Fri 2/6 2-5 PM (will be away Mon 2/9-Wed 2/11)
Tentative office hours starting 2/10: Tue 4-5, Thu 10-11, or by appointment

Teaching Fellow: Minh Nguyen
Office: Maxwell-Dworkin 138
Tentative office hours: Fri 4:30-6:30

E-mail address for questions: cs225@eecs.harvard.edu
E-mail address for submitting homeworks: cs225-hw@eecs.harvard.edu
Course website: http://www.courses.fas.harvard.edu/~cs225/

Course Description

Over the past few decades, randomization has become one of the most pervasive paradigms in computer science. Its widespread use includes:

Algorithm Design: For a number of important algorithmic problems (including problems in algebra, statistical physics, and approximate counting), the only efficient algorithms known are randomized.

Cryptography: Randomness is woven into the very way we define security.

Combinatorial Constructions: Many useful combinatorial objects, such as error-correcting codes and expander graphs (see below), can be constructed simply by generating them at random.

Interactive Proofs: Randomization, together with interactive communication, can also add dramatic efficiency improvements and novel properties (such as "zero knowledge") to classical "written" mathematical proofs.

So randomness appears to be extremely useful in these settings, but we still do not know to what extent it is really necessary. Thus, in this course we will ask:

Main Question: Can we reduce or even eliminate the need for randomness in the above settings?

Why do we want to do this? First, essentially all of the applications of randomness assume we have a source of perfect randomness, one that gives "coin tosses" that are completely unbiased and independent of each other. It is unclear whether physical sources of perfect randomness exist and are inexpensive to access. Second, randomized constructions of objects such as error-correcting codes and expander graphs often do not provide us with efficient algorithms for using them; indeed, even writing down a description of a randomly selected object can be infeasible. Finally, and most fundamentally, our understanding of computation would be incomplete without understanding the power that randomness provides.

In this course, we will address the Main Question via a powerful paradigm known as pseudorandomness. This is the theory of efficiently generating objects that "look random", despite being constructed using little or no randomness. Specifically, we will study several kinds of "pseudorandom" objects, such as:

Pseudorandom Generators: These are procedures which stretch a short "seed" of truly random bits into a long string of "pseudorandom" bits which cannot be distinguished from truly random by any efficient algorithm. They can be used to reduce and even eliminate the randomness used by any efficient algorithm. They are also a fundamental tool in cryptography.

Randomness Extractors: These are procedures which extract almost uniformly distributed bits from sources of biased and correlated bits. Their original motivation was to allow us to use randomized algorithms even with imperfect physical sources of randomness, but they have also turned out to have a wide variety of other applications.

Expander Graphs: These are graphs which are sparse but nevertheless highly connected. They have been used to address many fundamental problems in computer science, on topics such as network design, complexity theory, coding theory, cryptography, and computational group theory.

Error-Correcting Codes: These are methods for encoding messages so that even if many of the symbols are corrupted, the original message can still be decoded. We will focus on "list decoding", where there are so many corruptions that uniquely decoding the original message is impossible, but it is still possible to produce a short list of possible candidates.
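
To make the notion of list decoding concrete, here is a minimal brute-force sketch in Python. It uses the Hadamard code on 3-bit messages purely as a convenient small example; the choice of code, the function names, and the exhaustive search are illustrative assumptions, not the efficient polynomial-based decoders the course will cover.

```python
def hadamard_encode(message, k):
    """Encode a k-bit message (given as an integer) as the parities
    <message, a> mod 2 for every a in {0,1}^k, giving 2**k code symbols."""
    return [bin(message & a).count("1") % 2 for a in range(2 ** k)]


def hamming_distance(u, v):
    """Number of positions in which the two words differ."""
    return sum(x != y for x, y in zip(u, v))


def list_decode(received, k, radius):
    """Brute force: return every message whose codeword lies within
    Hamming distance `radius` of the received word."""
    return [m for m in range(2 ** k)
            if hamming_distance(hadamard_encode(m, k), received) <= radius]


k = 3
codeword = hadamard_encode(0, k)      # the all-zeros codeword of length 8

one_error = list(codeword)
one_error[1] ^= 1                     # corrupt one symbol
two_errors = list(one_error)
two_errors[3] ^= 1                    # corrupt a second symbol

unique = list_decode(one_error, k, radius=1)       # a single candidate
candidates = list_decode(two_errors, k, radius=2)  # several candidates
print(unique, candidates)
```

Distinct codewords of this code differ in exactly half of their positions (4 of 8 here), so a single error can be decoded uniquely, while two errors can leave the received word equally close to several codewords. The decoder then returns a short list that still contains the original message.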

Each of the above objects has been the center of a large and beautiful body of research, and until recently these corpora were largely distinct. An exciting recent development has been the realization that all four of these objects are almost the same when interpreted appropriately. Their intimate connections will be a major focus of the course, tying together the variety of constructions and applications of these objects we will cover.

The course will reach the cutting edge of current research in this area, covering some results from within the last year. At the same time, the concepts we will cover are general and useful enough that hopefully anyone with an interest in the theory of computation or combinatorics could find the material appealing.

Some Possible Topics

Randomized Algorithms:

examples
complexity classes (BPP, RP, RL,...)
basic properties, e.g. error reduction

Basic tools & notions:

tail inequalities
pairwise independence, almost k-wise independence
method of conditional expectations
entropy measures
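
As one illustration of the pairwise independence topic above, the following Python sketch stretches k truly random bits into 2^k - 1 pairwise independent bits by taking the XOR of every nonempty subset of the seed. This is a standard textbook construction, included here as an assumption about what a first example might look like; the function names are hypothetical, and the brute-force check is only feasible for very small k.

```python
from itertools import product


def pairwise_independent_bits(seed):
    """Stretch k seed bits into 2**k - 1 output bits.
    Output bit number s (1 <= s < 2**k) is the XOR of the seed bits
    selected by the binary expansion of s."""
    k = len(seed)
    out = []
    for subset in range(1, 2 ** k):
        bit = 0
        for i in range(k):
            if (subset >> i) & 1:
                bit ^= seed[i]
        out.append(bit)
    return out


def is_pairwise_independent(k):
    """Check by enumerating all 2**k seeds that every pair of output
    positions takes each of the values 00, 01, 10, 11 equally often."""
    outputs = [pairwise_independent_bits(s) for s in product((0, 1), repeat=k)]
    n = 2 ** k - 1
    for i in range(n):
        for j in range(i + 1, n):
            counts = {}
            for out in outputs:
                pair = (out[i], out[j])
                counts[pair] = counts.get(pair, 0) + 1
            if len(counts) != 4 or len(set(counts.values())) != 1:
                return False
    return True


print(is_pairwise_independent(3))
```

The output bits are far from fully independent: for example, the bits for the subsets {1}, {2}, and {1,2} always XOR to zero. Each pair of positions is nevertheless uniformly distributed, which is all that many analyses (such as those based on Chebyshev's inequality) require.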

Pseudorandom generators (Blum-Micali-Yao defn):

computational indistinguishability and its properties
hardcore bits
construction from one-way permutations
pseudorandom functions
applications: cryptography, learning, complexity

Pseudorandom generators (Nisan-Wigderson defn):

construction from hard Boolean functions ("nearly disjoint subsets" generator)
evidence that P=BPP
derandomizing constant-depth circuits
construction based on multivariate polynomials (after extractors)
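
The way a pseudorandom generator removes randomness from an algorithm can be sketched as follows: run the algorithm on the generator's output for every possible seed and take the majority answer. The Python below is only a schematic of that enumeration. Both toy_generator and toy_algorithm are hypothetical stand-ins invented for this sketch; in particular, toy_generator is not pseudorandom in any meaningful sense.

```python
from itertools import product


def derandomize(algorithm, generator, seed_len, x):
    """Deterministically simulate a randomized decision algorithm:
    feed it generator(seed) for every seed of length seed_len and
    return the majority answer."""
    seeds = list(product((0, 1), repeat=seed_len))
    yes_votes = sum(1 for seed in seeds if algorithm(x, generator(seed)))
    return 2 * yes_votes > len(seeds)


def toy_generator(seed):
    """Hypothetical stand-in: stretch the seed by one bit by appending
    its parity. This is NOT a real pseudorandom generator."""
    return tuple(seed) + (sum(seed) % 2,)


def toy_algorithm(x, coins):
    """Hypothetical stand-in for a randomized algorithm: decide whether
    x is even, but give the wrong answer when the first three coins are
    all 1 (error probability 1/8 on truly random coins)."""
    answer = (x % 2 == 0)
    return (not answer) if coins[:3] == (1, 1, 1) else answer


print(derandomize(toy_algorithm, toy_generator, 3, 10))
print(derandomize(toy_algorithm, toy_generator, 3, 7))
```

The simulation costs 2^seed_len runs of the algorithm, so it is efficient only when the seed length is logarithmic in the running time. It is correct whenever the algorithm cannot distinguish the generator's output from truly random bits.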

List-decodable error-correcting codes:

constructions based on multivariate polynomials
application to worst-case vs. avg-case complexity

Expander Graphs:

measures of expansion: vertex expansion, 2nd eigenvalue...
probabilistic existence
construction based on zig-zag graph product
applications

Extractors:

weak random sources
the Leftover Hash Lemma
connection to and construction from pseudorandom generators
connection to list-decodable codes & construction from multivariate polynomials
connection to expander graphs
applications
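
As a very simple warm-up for the idea of extracting unbiased bits from an imperfect source, here is a Python sketch of the classic von Neumann trick. It is offered only as an illustration under a strong assumption: the input bits are independent coin tosses that all have the same unknown bias. It does not handle the correlated sources that the extractors in this course are designed for. The function name and the simulated source are hypothetical.

```python
import random


def von_neumann_extract(bits):
    """Read the input in non-overlapping pairs. Output the first bit of
    each pair whose two bits differ, and discard pairs that agree.
    For independent, identically biased input bits, the pairs 01 and 10
    are equally likely, so each output bit is unbiased."""
    out = []
    for a, b in zip(bits[0::2], bits[1::2]):
        if a != b:
            out.append(a)
    return out


# Simulated biased source: each bit is 1 with probability 0.8.
rng = random.Random(0)
biased = [1 if rng.random() < 0.8 else 0 for _ in range(100000)]
extracted = von_neumann_extract(biased)
print(len(extracted), sum(extracted) / len(extracted))
```

The price of this simplicity is that many input bits are thrown away, and the guarantee disappears as soon as the input bits are correlated. Handling general weak random sources requires a different approach.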

Randomness Conductors:

construction from zig-zag product
expansion close to the degree

Derandomization vs. uniform complexity and vs. nonuniform complexity:

BPP vs. EXP, MA vs. NEXP
the "easy witness" method
some unconditional derandomizations
are circuit lower bounds necessary for derandomization?

Pseudorandom generators for space-bounded computation:

constructions from hashing, extractors
RL vs. L
applications: universal traversal sequences, data stream computations

Algebraic pseudorandomness:

small-bias spaces
Cayley expander graphs
linear error-correcting codes

Prerequisites

This is an advanced graduate course, so I will be assuming that you have general "mathematical maturity" and a good undergraduate background in the theory of computation. One concrete guideline is that you should have had a minimum of two other courses in the theory of computation, including at least one graduate course. If you have a particularly strong math background, then there can be a bit more flexibility with this. But if you haven't had a prior graduate course in the theory of computation (numbered CS 22x at Harvard), you must come speak to me at office hours before registering for the class.

In terms of topics, I will be assuming familiarity with the following. In all cases (especially complexity theory), the more background you have, the better.

Complexity Theory: P, NP, NP-completeness, reductions (as in CS 121).

Randomized Algorithms: You should have seen several examples of randomized algorithms, as in CS 124, 223, or 224.

Algebra: The basics of groups, (finite) fields, vector spaces, eigenvectors/eigenvalues. Any of CS 224, Math 122-123, AM 106 should be sufficient.

Other: Basic discrete probability, graph theory & combinatorics.

Grading & Problem Sets

The requirements of the course:

Biweekly problem sets
Taking "scribe notes" for 1-3 lectures (depending on how many students are in the class), and typing them up in LaTeX.
Possible take-home final exam or in-class presentations (depending on size/composition of class).

The biweekly problem sets will typically be due on Mondays by 5 PM. Your problem set solutions must be typed and submitted electronically to cs225-hw@eecs.harvard.edu. You are allowed 12 late days for the semester, of which at most 7 can be used on any individual problem set. (One late day is exactly 24 hours.)

The problem sets will be challenging, so be sure to start them early. You are encouraged to discuss the course material and the homework problems with each other in small groups (2-3 people). Discussion of homework problems may include brainstorming and verbally walking through possible solutions, but should not include one person telling the others how to solve the problem. In addition, each person must write up their solutions independently, and these write-ups should not be checked against each other or passed around.

Readings

There is no required text for the course. Indeed, much of the material we will be covering is not written in any textbook. However, you may find the following references useful. Most of them should be in the libraries, on reserve.

Oded Goldreich. Modern Cryptography, Probabilistic Proofs, and Pseudorandomness. This book contains an excellent overview of the theory of pseudorandomness (as of a couple of years ago). It is written survey-style, so it doesn't contain many details, but it is great as a starting point and can lead to relevant papers. I've ordered it at the Coop.

Peter Bro Miltersen. Derandomizing Complexity Classes. A book chapter to appear in the Handbook of Randomization, currently available online. A nice detailed presentation of many of the objects we will be studying. A good place for an alternative viewpoint to the lectures.

Rajeev Motwani and Prabhakar Raghavan. Randomized Algorithms. For more background on and examples of randomized algorithms.

Michael Sipser. Introduction to the Theory of Computation. Contains all of the complexity theory background you'll need in this course.

Christos Papadimitriou. Computational Complexity. Also good for more background on complexity, more advanced than Sipser's text.

Noga Alon and Joel Spencer. The Probabilistic Method (2nd ed). Describes the usefulness of randomness in combinatorics, with some discussion of pseudorandomness and derandomization.

Oded Goldreich. Foundations of Cryptography (Vol I). Covers the BMY definition of pseudorandom generators, pseudorandom functions, and their applications in cryptography.

Michael Luby. Pseudorandomness and Cryptographic Applications. Covers the construction of pseudorandom generators from any one-way function, which is beyond the scope of our course and Goldreich's book.

Algebra: Two recommendations are some lecture notes by Madhu Sudan at http://theory.lcs.mit.edu/~madhu/FT01/scribe/algebra.ps and the text by Rudolf Lidl and Harald Niederreiter, Introduction to Finite Fields and their Applications. There are many other texts on algebra, though not all have much coverage of finite fields.

Related Courses This Term

CS 221: Computational Complexity. An excellent course to take in parallel.

CS 124: Data Structures & Algorithms.

AM 107: Graph Theory & Combinatorics.

MIT 6.875/18.875: Cryptography and Cryptanalysis.
MIT 6.897: Selected Topics in Cryptography.
Plus many other theory of computation courses at MIT.