CS 284r: (Social) Data Mining

SEAS

Instructor: Yaron Singer
Time: M./W. 10-11:30 am
Room: Pierce 100F
Email: yaron at seas dot harvard dot edu

Staff: We're fortunate to have Anudhyan Boral (anudhyan at seas dot harvard dot edu) as our teaching fellow this semester.

Overview: This is a rotating topics course on computation in networks and crowds. This semester we will focus on algorithms for mining large-scale social network data sets. The availability of such data at massive scale provides a unique system-wide perspective on collective human behavior. Analyzing social network data sets involves complex optimization techniques where the underlying network structure plays an important role and requires overcoming various limitations imposed by large-scale, and often noisy, data. In this course we will cover advanced methods for such tasks, largely by studying recent research papers.

Prerequisites: This will be a mathematically rigorous course. Prerequisites include CS124 and, importantly, mathematical maturity. There may be some programming exercises. The course is intended for graduate students, but advanced undergraduates are encouraged to attend as well.

Goals: The goal of this course is to expose students to advanced methods in data mining algorithms. To this end, the course will mainly cover recent papers, and encourage research in this area.

Assessment: The main component of this course will be a research project. This project can be data-oriented or theoretically focused, or (better) a combination of both. Each student will also present a paper or two, scribe lectures, read papers and submit comments, and complete a couple of problem sets. The final grade in the class will roughly break down as: participation and comments 25%, problem sets 25%, presentation of a research paper 15%, project 35%, arithmetic skills 10%.
