33 Oxford Street, Cambridge, MA 02138
Phone: 617-496-1905 (O)
New: I am on temporary leave, working with Google’s Advanced Technology and Projects (ATAP) group.
New: I co-chair the First International Workshop on Visual Domain Adaptation and Dataset Bias (VisDA) with Brian Kulis, Kate Saenko, and Fei Sha.
Ruonan Li received the B.Eng. and M.S. degrees with honors in electrical engineering from Tsinghua University in 2004 and 2006, respectively. He received the Ph.D. degree in electrical engineering from the University of Maryland in 2011, advised by Rama Chellappa. He joined the School of Engineering and Applied Sciences, Harvard University, as a Postdoctoral Fellow in 2011 under the direction of Todd Zickler, and was appointed Research Associate in 2012. He is a member of the Graphics, Vision and Interaction Group. His research is focused on understanding deep semantics in large-scale image data that emerges from and evolves with new acquisition and sharing modalities. His work is motivated by applications in object and behavior recognition; video analysis and spatio-temporal modeling; semi-supervised and unsupervised learning; and social signal processing.
Research and Publications
Ruonan Li's research is focused on understanding deep semantics in large-scale image data that emerges from and evolves with new acquisition and sharing modalities. His work currently includes three major aspects: (1) enabling large-scale inference on unannotated imagery; (2) discovering and representing novel semantics; and (3) developing tools for optimization and statistics in novel spaces.
Recognition in the Presence of Data-Selection Bias and Annotation Poverty
The lack of annotation prevents visual recognition models from being directly optimized on the massive emerging imagery, which exhibits statistical properties different from those of existing annotated datasets. I use visual domain adaptation to adapt visual recognition models to this massive imagery without requesting additional supervision. The following work on a new representation for domain shifts is the first to enable adapting object recognition models to a completely unannotated new dataset.
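The domain-shift representation itself is described in the paper. As a minimal sketch of the unsupervised setting only, the code below uses a simple correlation-alignment heuristic (not the paper's method): it matches the second-order statistics of unlabeled source features to a target domain, so no target labels are needed. The feature matrices here are synthetic stand-ins.

```python
import numpy as np

def align_covariances(source, target, eps=1e-6):
    """Whiten source features, then re-color with the target covariance.

    After this transform the source features share the target domain's
    mean and second-order statistics, a simple heuristic for adapting
    without any target annotation.
    """
    cs = np.cov(source, rowvar=False) + eps * np.eye(source.shape[1])
    ct = np.cov(target, rowvar=False) + eps * np.eye(target.shape[1])

    def sqrtm(m, inv=False):
        # Matrix square root via eigendecomposition (covariances are SPD).
        w, v = np.linalg.eigh(m)
        w = np.clip(w, eps, None)
        d = w ** (-0.5 if inv else 0.5)
        return (v * d) @ v.T

    centered = source - source.mean(axis=0)
    return centered @ sqrtm(cs, inv=True) @ sqrtm(ct) + target.mean(axis=0)

# Synthetic source/target features with deliberately different statistics.
rng = np.random.default_rng(0)
src = rng.normal(size=(500, 4)) @ np.diag([3.0, 1.0, 0.5, 0.2])
tgt = rng.normal(size=(500, 4)) @ np.diag([1.0, 2.0, 1.5, 0.8]) + 1.0
adapted = align_covariances(src, tgt)
```

A classifier trained on `adapted` with the source labels then sees features whose statistics resemble the target domain's.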
A discriminatively optimized variant of the domain-shift representation succeeds on another challenging task, recognizing human actions across viewpoints, reported in the following paper.
Discovering and Inferring Novel Semantics
In exploring the many types of deep semantics in massive imagery that remain "hidden" from today's systems, I have begun establishing the foundations of social visual analytics, a means of "socially aware" visual processing that can infer social semantics from imagery. To this end, I developed the first spatio-temporal social interaction detector to match similar social interactions occurring in large social groups.
In a companion effort on sports games, salient collaborative and reactive motions are distinguished from the dynamics of large crowds.
Optimization and Statistics for Visual Representations
Novel imaging modalities, data formats, and semantic structures in imagery often lead to novel types of mathematical spaces (e.g., manifolds and discrete structures such as graphs) that are hard to analyze using existing optimization techniques. To address this, I have contributed state-of-the-art solutions to optimization problems formulated on a special type of such spaces -- Riemannian manifolds. I proposed the first randomized optimization algorithm on the Cartesian product of spatial and temporal deformations for aligning videos and other spatio-temporal signals.
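The randomized algorithm over joint spatial and temporal deformations is detailed in the paper. As a far simpler relative of the temporal part of that problem, the toy sketch below aligns two 1-D signals with classical dynamic time warping, which searches over monotone temporal warps only; the signals are synthetic.

```python
import numpy as np

def dtw(a, b):
    """Classical dynamic time warping between two 1-D sequences.

    Returns the minimal cumulative alignment cost over all monotone
    temporal warps, computed by dynamic programming.
    """
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # a[i-1] repeats
                                 cost[i, j - 1],      # b[j-1] repeats
                                 cost[i - 1, j - 1])  # one-to-one match
    return cost[n, m]

# A signal and a time-warped copy align cheaply; a sign-flipped
# version of the signal does not.
x = np.sin(np.linspace(0, 2 * np.pi, 50))
y = np.sin(np.linspace(0, 2 * np.pi, 80))  # same shape, different rate
warped_cost = dtw(x, y)
```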
I also developed a functional optimization procedure to learn parametric probabilistic models on the hypersurface of normalized visual features.
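L2-normalized feature vectors live on the unit hypersphere, where ordinary Gaussian models do not apply. As a toy illustration of a parametric density on that hypersurface (not the procedure from the paper), the sketch below fits the mean direction of a von Mises-Fisher model by moments, with a common closed-form approximation to its concentration parameter; the feature vectors are synthetic.

```python
import numpy as np

def fit_vmf(x):
    """Moment estimates for a von Mises-Fisher density on the unit sphere.

    x: (n, d) array of L2-normalized feature vectors.
    Returns (mu, kappa): the mean direction, and a common closed-form
    approximation to the concentration parameter.
    """
    n, d = x.shape
    s = x.sum(axis=0)
    r = np.linalg.norm(s)
    mu = s / r                       # maximum-likelihood mean direction
    rbar = r / n                     # mean resultant length, in (0, 1)
    kappa = rbar * (d - rbar ** 2) / (1.0 - rbar ** 2)
    return mu, kappa

# Synthetic unit vectors clustered around a known direction.
rng = np.random.default_rng(1)
center = np.array([1.0, 0.0, 0.0])
pts = center + 0.2 * rng.normal(size=(1000, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)  # project to sphere
mu, kappa = fit_vmf(pts)
```

Larger `kappa` means the features concentrate more tightly around `mu`, which is the sense in which the model is a spherical analogue of an isotropic Gaussian.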
A summary of recent advances in differential geometric methods for pattern recognition and computer vision applications, by my collaborators and myself, can be found below.
Last update: 12/2014