Andrew Miller, Albert Wu, Jeffrey Regier, Jon McAuliffe, Dustin Lang, Mr Prabhat, David Schlegel, Ryan Adams
Neural Information Processing Systems 2015 [pdf]
We propose a method for combining two sources of astronomical data, spectroscopy and photometry, which carry information about sources of light (e.g., stars, galaxies, and quasars) at extremely different spectral resolutions. Our model treats the spectral energy distribution (SED) of the radiation from a source as a latent variable, hierarchically generating both photometric and spectroscopic observations. We place a flexible, nonparametric prior over the SED of a light source that admits a physically interpretable decomposition and allows us to perform inference tractably. We use our model to predict the distribution of the redshift of a quasar from five-band (low spectral resolution) photometric data, the so-called "photo-z" problem. Our method shows that tools from machine learning and Bayesian statistics allow us to leverage multiple resolutions of information to make accurate predictions with well-characterized uncertainties.
Vasileios Lampos, Andrew Miller, Steve Crossan, and Christian Stefansen
Scientific Reports [pdf]
This paper presents an improvement on the Google Flu Trends model, an epidemiological surveillance tool for measuring the current rate of influenza-like illness (ILI) in the population. These methods relate patterns in user search queries to historical influenza estimates to obtain real-time ILI estimates. We develop a non-linear model based on Gaussian processes and a family of autoregressive models. We compare it to many previously proposed methods, assessing predictive performance over five years of flu seasons, 2008-2013, and show that it obtains state-of-the-art predictive performance.
Jeffrey Regier, Andrew Miller, Jon McAuliffe, Ryan Adams, Matt Hoffman, Dustin Lang, David Schlegel, Mr Prabhat
Proceedings of The 32nd International Conference on Machine Learning, pp. 2095–2103, 2015 [link]
We present a new, fully generative model of optical telescope image sets, along with a variational procedure for inference. Each pixel intensity is treated as a Poisson random variable, with a rate parameter dependent on latent properties of stars and galaxies. Key latent properties are themselves random, with scientific prior distributions constructed from large ancillary data sets. We check our approach on synthetic images. We also run it on images from a major sky survey, where it exceeds the performance of the current state-of-the-art method for locating celestial bodies and measuring their colors.
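The per-pixel Poisson likelihood described above can be illustrated with a minimal sketch. This is not the paper's model or its variational procedure: it assumes a single star, a circular Gaussian point spread function, and a flat background, and the names `star_rate` and `poisson_loglik` are hypothetical helpers introduced only for this example.

```python
import numpy as np

def star_rate(shape, center, flux, background, psf_sigma=1.5):
    """Expected photon count per pixel for one star (assumed Gaussian PSF)."""
    rows, cols = np.indices(shape)
    r2 = (rows - center[0]) ** 2 + (cols - center[1]) ** 2
    psf = np.exp(-0.5 * r2 / psf_sigma ** 2) / (2 * np.pi * psf_sigma ** 2)
    return background + flux * psf

def poisson_loglik(counts, rate):
    """Poisson log-likelihood of the image, up to a constant that does not
    depend on the rate (the log-factorial term is dropped)."""
    return float(np.sum(counts * np.log(rate) - rate))

# Simulate an image from the generative model, then score two candidate
# star locations by their log-likelihood.
rng = np.random.default_rng(0)
true_rate = star_rate((21, 21), (10, 10), 5000.0, 10.0)
image = rng.poisson(true_rate)
for candidate in [(10, 10), (4, 15)]:
    rate = star_rate((21, 21), candidate, 5000.0, 10.0)
    print(candidate, poisson_loglik(image, rate))
```

A full treatment would place priors on the latent properties and infer a posterior rather than comparing point candidates, as the abstract describes.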
Alexander Franks, Andrew Miller, Luke Bornn, and Kirk Goldsberry
We develop a spatial model to analyze the defensive ability of professional basketball players. We first define two preprocessing steps to find a representation of players and possessions, and then we define a parametric model with effects that correspond to interpretable defensive ability.
Alexander Franks*, Andrew Miller*, Luke Bornn, and Kirk Goldsberry
Best paper award.
This paper describes some advanced defensive metrics for NBA basketball, derived from player tracking data. We use the "who's guarding whom" model from this paper to define a new suite of metrics designed to measure how suppressive and disruptive players are, both on average and throughout the entire possession.
Andrew Miller, Luke Bornn, Ryan Adams and Kirk Goldsberry
International Conference on Machine Learning (ICML), 2014 [arxiv]
We develop a dimensionality reduction method that can be applied to collections of point processes on a common space. Using this representation, we analyze the shooting habits of professional basketball players, create a new characterization of offensive player types and model shooting efficiency.
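As a rough illustration of reducing a collection of point patterns to a few shared spatial bases, the sketch below bins each pattern onto a grid and factorizes the resulting count matrix. Non-negative matrix factorization with Lee-Seung multiplicative updates is an assumption chosen for this example, not a description of the paper's method; `bin_points` and `nmf` are hypothetical helper names, and the data are synthetic.

```python
import numpy as np

def bin_points(points, bins=10, extent=((0.0, 1.0), (0.0, 1.0))):
    """Turn an (n, 2) array of point locations into a flat vector of grid counts."""
    counts, _, _ = np.histogram2d(points[:, 0], points[:, 1],
                                  bins=bins, range=extent)
    return counts.ravel()

def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Multiplicative updates for min ||V - W H||_F^2 subject to W, H >= 0."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.uniform(0.1, 1.0, size=(n, rank))
    H = rng.uniform(0.1, 1.0, size=(rank, m))
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic "players": each row of V is one player's binned point pattern.
rng = np.random.default_rng(1)
V = np.stack([bin_points(rng.uniform(0, 1, size=(300, 2))) for _ in range(8)])
W, H = nmf(V, rank=3)
# Rows of H are shared spatial bases; rows of W are per-player weights.
print(W.shape, H.shape)
```

The per-player weight vectors in `W` are the low-dimensional representation; with real data they could be compared across players to group similar shooting habits.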
Andrew Miller, Vishal Jain and Joseph L. Mundy
IEEE Transactions on Parallel and Distributed Systems (submitted for review)
This paper presents a scalable system of multiple GPUs and CPUs to reconstruct dense 3-d models. This is a continuation of Miller 2011 (which constructed models of size ~1 billion voxels) that extends the system to models in the 50-100 billion voxel range. Results are shown for building a 3-d model of an area of about 2 square kilometers (< 1 meter resolution) represented by 50 billion voxels over 4 GPUs in near real-time.
Vishal Jain, Andrew Miller and Joseph L. Mundy
2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) [pdf]
This paper presents a system that fuses both optical and infrared imagery to build a volumetric model. We develop a technique to tightly register multiple volumetric models, and show the benefits of the multi-modal datasource by developing classifiers to label high level features of the landscape (road, sidewalk, pavement, buildings, etc.).
Andrew Miller, Vishal Jain and Joseph L. Mundy
Proceedings of the Fourth Workshop on General Purpose Processing on Graphics Processing Units, ASPLOS 2011 [pdf]
We develop and optimize a parallel ray tracing-inspired algorithm for both constructing and rendering a high fidelity 3-d volumetric model from aerial imagery. This paper describes the engineering effort behind an 800x speedup over serial implementations using a single GPU.
Simple, lightweight dynamic time warping implementation (and visualization) in numpy/python/cython.
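For readers unfamiliar with dynamic time warping, the textbook recurrence can be sketched in a few lines of numpy. This is a generic illustration with an absolute-difference local cost, not the code from the repository; `dtw_distance` is a hypothetical name.

```python
import numpy as np

def dtw_distance(x, y):
    """Dynamic time warping cost between two 1-d sequences, O(len(x) * len(y))."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = len(x), len(y)
    # cost[i, j] is the minimal accumulated cost of aligning x[:i] with y[:j].
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            # Extend the cheapest of: step in x only, step in y only, step in both.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

# A sequence and a time-stretched copy of it align at zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0
```

The double loop is the part a Cython implementation would typically accelerate.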
A python module for astronomical source discovery and classification.
Sampyl is a package for sampling from probability distributions using MCMC methods. Just as PyMC3 uses Theano to compute gradients, Sampyl uses autograd. However, autograd is not required: you are free to write your own gradient functions. This project was started as a way to use MCMC samplers by defining models purely with Python and numpy.
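The idea of defining a model purely as a Python log-density function can be sketched with a plain random-walk Metropolis sampler. This is not Sampyl's API: `metropolis` and `logp` are hypothetical names written only for this example, and the sampler is gradient-free, so it needs neither autograd nor a hand-written gradient.

```python
import numpy as np

def metropolis(logp, start, n_samples, step=0.5, seed=0):
    """Random-walk Metropolis: draw samples from the density proportional to exp(logp)."""
    rng = np.random.default_rng(seed)
    x = np.atleast_1d(np.asarray(start, dtype=float))
    lp = logp(x)
    samples = np.empty((n_samples, x.size))
    for t in range(n_samples):
        proposal = x + step * rng.standard_normal(x.size)
        lp_prop = logp(proposal)
        # Accept with probability min(1, exp(lp_prop - lp)).
        if np.log(rng.uniform()) < lp_prop - lp:
            x, lp = proposal, lp_prop
        samples[t] = x
    return samples

def logp(x):
    """The model: an unnormalized log-density of a unit-variance normal centered at 3."""
    return -0.5 * np.sum((x - 3.0) ** 2)

samples = metropolis(logp, start=[0.0], n_samples=20000, step=1.0)
print(samples[1000:].mean(), samples[1000:].std())
```

Gradient-based samplers follow the same pattern but additionally need the gradient of `logp`, which is where automatic differentiation or a user-supplied gradient function comes in.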