Abstract:

We develop a principled way of identifying probability distributions whose independent and identically distributed (iid) realizations are compressible, i.e., can be approximated as sparse. We focus on the context of Gaussian random underdetermined linear regression (GULR) problems, where compressibility is known to ensure the success of estimators exploiting sparse regularization. We prove that many of the conventional priors associated with probabilistic interpretations of p-norm (p <= 1) regularization algorithms are in fact incompressible in the limit of large problem sizes. To show this, we identify nontrivial undersampling regions in GULR where the simple least squares solution almost surely outperforms an oracle sparse solution when the data are generated from a prior such as the Laplace distribution. We provide rules of thumb to characterize large families of compressible and incompressible priors based on their second and fourth moments. Generalized Gaussians and generalized Pareto distributions serve as running examples for concreteness. We conclude with a study of the statistics of wavelet coefficients of natural images in the context of compressible priors.