Ruonan Li

Research Associate

School of Engineering and Applied Sciences

Harvard University


33 Oxford Street, Cambridge, MA 02138

Phone: 617-495-4478 (O)

Email:

New: I co-chair the First International Workshop on Visual Domain Adaptation and Dataset Bias (VisDA) with Brian Kulis, Kate Saenko, and Fei Sha.

Biography

Ruonan Li received the B.Eng. and M.S. degrees with honors in electrical engineering from Tsinghua University in 2004 and 2006, respectively. He received the Ph.D. degree in electrical engineering from the University of Maryland in 2011, advised by Rama Chellappa. He joined the School of Engineering and Applied Sciences, Harvard University, as a Postdoctoral Fellow in 2011 under the direction of Todd Zickler, and was appointed Research Associate in 2012. He is a member of the Graphics, Vision and Interaction Group. His research is focused on understanding deep semantics in large-scale image data that emerges from and evolves with new acquisition and sharing modalities. His work is motivated by applications in object and behavior recognition; video analysis and spatio-temporal modeling; semi-supervised and unsupervised learning; and social signal processing.

Research and Publications

Ruonan Li's research is focused on understanding deep semantics in large-scale image data that emerges from and evolves with new acquisition and sharing modalities. His work currently includes three major aspects: (1) Enabling large-scale inference on unannotated imagery; (2) Discovering and representing novel semantics; and (3) Developing tools for optimization and statistics in novel spaces.

Recognition in the Presence of Data-Selection Bias and Annotation Poverty

The lack of annotation prevents visual recognition models from being directly optimized on the massive emerging imagery that exhibits different statistical properties from existing annotated datasets. I use visual domain adaptation to adapt visual recognition models to massive imagery without requesting additional supervision. My work below on a new representation for domain shifts is the first to enable adapting object recognition models to a completely unannotated new dataset.

R. Gopalan, R. Li, and R. Chellappa, Unsupervised Adaptation Across Domain Shifts By Generating Intermediate Data Representations. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2014, to appear. [PDF]

R. Gopalan, R. Li, and R. Chellappa, Domain Adaptation for Object Recognition: An Unsupervised Approach. IEEE International Conference on Computer Vision (ICCV), 2011 (Oral). [PDF][Code]
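As a rough, self-contained illustration of the intermediate-representation idea behind the two papers above (not the published implementation), the sketch below samples subspaces along the Grassmann geodesic between a source and a target PCA subspace and concatenates projections onto them as a domain-robust feature. The dimensions, sampling grid, and function names are placeholders chosen for this example.

```python
import numpy as np

def pca_basis(X, d):
    """Orthonormal basis (D x d) of the top-d principal subspace of row-wise data X (n x D)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:d].T

def intermediate_subspaces(Y1, Y2, ts):
    """Sample orthonormal bases along the Grassmann geodesic from span(Y1) to span(Y2).

    Y1, Y2: D x d orthonormal bases; ts: values in [0, 1] (0 = source, 1 = target).
    """
    O1, cos_theta, O2t = np.linalg.svd(Y1.T @ Y2)
    cos_theta = np.clip(cos_theta, -1.0, 1.0)
    theta = np.arccos(cos_theta)                       # principal angles between subspaces
    Y1r, Y2r = Y1 @ O1, Y2 @ O2t.T                     # principal (aligned) bases
    sin_theta = np.sin(theta)
    # Unit directions orthogonal to Y1r; guard against tiny angles (already aligned dims).
    G = np.where(sin_theta > 1e-8,
                 (Y2r - Y1r * cos_theta) / np.maximum(sin_theta, 1e-8),
                 0.0)
    return [Y1r * np.cos(t * theta) + G * np.sin(t * theta) for t in ts]

# Toy usage: source and target features related by a simulated domain shift (a rotation).
rng = np.random.default_rng(0)
Xs = rng.normal(size=(200, 50))                        # "source" features
Xt = Xs @ np.linalg.qr(rng.normal(size=(50, 50)))[0]   # "target" = rotated source
Ys, Yt = pca_basis(Xs, 10), pca_basis(Xt, 10)
bases = intermediate_subspaces(Ys, Yt, ts=np.linspace(0, 1, 6))
# Concatenate projections onto all intermediate subspaces; train a classifier on feat_s,
# apply it to feat_t without any target labels.
feat_s = np.hstack([Xs @ B for B in bases])
feat_t = np.hstack([Xt @ B for B in bases])
```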

A discriminatively optimized variant of this domain-shift representation succeeds on another challenging task, recognizing human actions across viewpoints, as reported in the following paper.

R. Li and T. Zickler, Discriminative Virtual Views for Cross-View Action Recognition. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012 (Oral). [PDF][Slides][Code]

Discovering and Inferring Novel Semantics

In exploring the many types of deep semantics in massive imagery that remain "hidden" from today's systems, I have begun establishing foundations for social visual analytics, a means of "socially-aware" visual processing that can infer social semantics from imagery. To this end, I developed the first spatio-temporal social interaction detector to match similar social interactions occurring in large social groups.

R. Li, P. Porfilio, and T. Zickler, Finding Group Interactions in Social Clutter. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013. [PDF][Dataset Page]

R. Li, P. Porfilio, and T. Zickler, Finding Group Interactions in Social Clutter. Technical Report, Computer Science Group, Harvard University TR-01-13, 2013. [PDF]

In a companion effort on sports games, salient collaborative and reactive motions are distinguished from the dynamics of large crowds.

R. Li and R. Chellappa, Group motion segmentation using a spatio-temporal driving force model. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010. [PDF]

Optimization and Statistics for Visual Representations

Novel imaging modalities, data formats, and semantic structures in imagery often lead to novel types of mathematical spaces (e.g., manifolds and discrete structures such as graphs) that are hard to analyze using existing optimization techniques. To address this, I have contributed state-of-the-art solutions to optimization problems formulated on one important class of such spaces: Riemannian manifolds. I proposed the first randomized optimization algorithm on the Cartesian product of spatial and temporal deformations for aligning videos and other spatio-temporal signals.

R. Li and R. Chellappa, Spatio-Temporal Alignment of Visual Signals on a Special Manifold. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 35(3):697-715, 2013. [PDF]

R. Li and R. Chellappa, Aligning spatio-temporal signals on a special manifold. European Conference on Computer Vision (ECCV), 2010. [PDF]
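The algorithms in the two papers above operate on a structured manifold of spatial and temporal warps; the toy sketch below only conveys the flavor of randomized search over a joint spatial-temporal deformation, using a deliberately tiny parameter family (integer circular shifts) and a plain correlation score. Every name and parameter here is illustrative rather than taken from the published method.

```python
import numpy as np

def alignment_score(a, b):
    """Normalized correlation between two equally sized video volumes (T x H x W)."""
    a = (a - a.mean()) / (a.std() + 1e-8)
    b = (b - b.mean()) / (b.std() + 1e-8)
    return float((a * b).mean())

def random_spatiotemporal_align(ref, probe, n_iter=5000, max_shift=5, max_lag=5, seed=0):
    """Random search over a joint deformation: a spatial (dy, dx) shift combined with a temporal lag.

    Returns the best (dy, dx, lag) found and its score -- a toy stand-in for optimizing
    over a product of spatial and temporal deformation spaces.
    """
    rng = np.random.default_rng(seed)
    best, best_score = (0, 0, 0), -np.inf
    for _ in range(n_iter):
        dy = int(rng.integers(-max_shift, max_shift + 1))
        dx = int(rng.integers(-max_shift, max_shift + 1))
        lag = int(rng.integers(-max_lag, max_lag + 1))
        # Apply the candidate space-time deformation to the probe volume and score it.
        warped = np.roll(probe, shift=(lag, dy, dx), axis=(0, 1, 2))
        score = alignment_score(ref, warped)
        if score > best_score:
            best, best_score = (dy, dx, lag), score
    return best, best_score

# Toy usage: recover a known space-time shift between two synthetic volumes.
rng = np.random.default_rng(1)
ref = rng.normal(size=(40, 32, 32))
probe = np.roll(ref, shift=(-3, -2, 4), axis=(0, 1, 2))   # probe is ref shifted in space-time
print(random_spatiotemporal_align(ref, probe))            # best candidate should usually undo (-3, -2, 4)
```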

I also developed a functional optimization procedure to learn parametric probabilistic models on the hyper-surface of normalized visual features.

R. Li, R. Chellappa, and S. Zhou, Recognizing Interactive Group Activities Using Temporal Interaction Matrices and Their Riemannian Statistics. International Journal of Computer Vision (IJCV), 101(2): 305-328, 2013. [PDF]

R. Li, R. Chellappa, and S. Zhou, Learning multi-modal densities on discriminative temporal interaction manifold for group activity recognition. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009. [PDF]
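The papers above define statistics on a manifold of temporal interaction matrices; as a generic, simplified stand-in for learning a parametric density on a hyper-surface of normalized features, the sketch below fits a von Mises-Fisher distribution to L2-normalized feature vectors, using the widely used closed-form approximation for the concentration parameter. It illustrates the general idea only and is not the published procedure.

```python
import numpy as np

def fit_von_mises_fisher(X):
    """Fit the mean direction mu and concentration kappa of a vMF density to unit-norm rows of X.

    Uses the common closed-form approximation for kappa (Banerjee et al., 2005).
    """
    X = X / np.linalg.norm(X, axis=1, keepdims=True)    # ensure points lie on the unit hypersphere
    d = X.shape[1]
    resultant = X.mean(axis=0)                          # mean resultant vector
    r_bar = np.linalg.norm(resultant)                   # its length, in (0, 1)
    mu = resultant / r_bar                              # mean direction on the sphere
    kappa = r_bar * (d - r_bar**2) / (1.0 - r_bar**2)   # approximate concentration
    return mu, kappa

# Toy usage: tightly clustered unit vectors should yield a large kappa.
rng = np.random.default_rng(0)
base = rng.normal(size=64)
samples = base + 0.1 * rng.normal(size=(500, 64))       # small perturbations of one direction
mu, kappa = fit_von_mises_fisher(samples)
print(kappa)                                            # large value -> highly concentrated density
```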

A summary of recent advances in differential geometric methods for pattern recognition and computer vision applications, written by my collaborators and myself, can be found below.

R. Li, P. Turaga, A. Srivastava, and R. Chellappa, Differential geometric representations and algorithms for some pattern recognition and computer vision problems. Pattern Recognition Letters (PRL), 2014, to appear. [Link]

Last update: 6/2014