Noisy, Missing and Corrupted Data
Many models for sparse regression typically assume that the covariates are known completely, and without noise. Particularly in high-dimensional applications, this is often not the case. In this talk, we develop efficient OMP-like algorithms to deal with precisely this setting. Our algorithms are as efficient as OMP, and improve on the best-known results for missing and noisy data in regression, both in the high-dimensional setting where we seek to recover a sparse vector from only a few measurements, and in the classical low-dimensional setting where we recover an unstructured regressor. In the high-dimensional setting, our support-recovery algorithm requires no knowledge of even the statistics of the noise. Along the way, we also obtain improved performance guarantees for OMP for the standard sparse regression problem with Gaussian noise. Time permitting, we present some results on the more difficult setting of arbitrarily corrupted covariates.