One-bit compressed sensing and fast binary embeddings
- 1-Bit komprimierte Erfassung und schnelle binäre Einbettungen
Stollenwerk, Alexander; Dirksen, Sjoerd (Thesis advisor); Rauhut, Holger (Thesis advisor)
Aachen (2019, 2020)
Dissertation / PhD Thesis
Dissertation, RWTH Aachen University, 2019
Abstract
The focus of this dissertation lies on two related problems from data science: estimation of structured high-dimensional vectors from few binary measurements, and the design of computationally efficient binary embedding methods. While the former problem is more broadly studied in the field of compressed sensing, the latter task belongs to the field of dimensionality reduction. Both problems are linked by the joint assumption, inspired by empirical observations in numerous applications, that the data vectors considered exhibit some type of low-complexity structure. Furthermore, there is a geometric connection between both problems, which becomes apparent when visualizing the considered binary embedding, respectively sensing maps as being induced by hyperplane tessellations of the data set. As a consequence, not only the considered complexity measures are the same, but also the applied theoretical tools are very similar. For the task of signal estimation from binary measurements we investigate the recovery performance of a specific convex reconstruction program, called hinge loss minimization. The program is based on constrained empirical risk minimization and uses the hinge loss function as a data fidelity term. While successfully applied in classifications tasks, the program's ability to estimate a specific signal vector is only poorly understood. A major difficulty is that the hinge loss is just piecewise linear, so that its "curvature energy" is concentrated in a single point. This is substantially different from other popular loss functions considered in signal estimation, e.g., the square or logistic loss, which are at least locally strongly convex. Still, our reconstruction results show that signal estimation from one-bit measurements via hinge loss minimization is feasible under fairly general signal model conditions and even in the presence of strong noise that corrupts the quantization process. Regarding the task of fast binary dimensionality reduction, we consider embedding maps that first linearly map a vector to a lower-dimensional Euclidean space and then quantize the embedded vector by taking the sign component-wise. More specifically, we show that such maps, where the linear transformation is given by a construction involving Gaussian circulant matrices, can be used for the task of preserving pairwise angular distances between data vectors. Further, by additionally adding designed noise prior to the quantization step, these binary maps induced by Gaussian circulant matrices also succeed at the task of preserving Euclidean distances. The embedding maps can be computed efficiently, meaning that every vector can be embedded in near-linear time, and for finite sets the maps (almost) attain the minimal embedding dimension possible. Finally, we show that by considering so-called dithered Gaussian circulant binary embeddings, also infinite, but low-complexity and bounded data sets can be efficiently embedded, in such a way that all pairwise Euclidean distances are approximately preserved.
Institutions
- Department of Mathematics [110000]
- Chair of Mathematics of Information Processing [114510]
Identifier
- DOI: 10.18154/RWTH-2020-00296
- RWTH PUBLICATIONS: RWTH-2020-00296