Model-based analysis of the image generation quality of adversarial latent autoencoders for industrial machine vision
Yermakov, Ruslan; Berkels, Benjamin (Thesis advisor); Schmitt, Robert H. (Thesis advisor); Wolfschläger, Dominik (Consultant)
Aachen : RWTH Aachen University (2023)
Master's thesis, RWTH Aachen University, 2022
In industrial machine vision applications, generative models have an advantage over discriminative models because they enable interpretable feature extraction and overcome the limitations of non-transparent and inexplicable decisions associated with the latter. State-of-the-art style-based generative adversarial networks (StyleGANs) generate high-quality images mapped from a latent feature vector space by learning an approximation of the high-dimensional training data distribution. Learned representations can thereby be identified by interpreting the latent space and used to control the properties of the synthesized images. StyleGANs in combination with an embedding algorithm, the adversarial latent autoencoder (ALAE), enable the assessment of the properties of embedded images through their latent space representations. However, to achieve the best possible approximation of the training data distribution, the network's design parameters must be optimized according to how effectively the network learns the quality characteristics of industrial machine vision data. This requires a quantitative evaluation of the quality of generated and reconstructed images in comparison to the quality of original images. This work presents an evaluation framework to assess the capabilities of generative ALAE models to learn the features and characteristics of the original data consistently and reliably. Feature consistency is proposed as an evaluation criterion to estimate the performance of the generative models. The quality of images generated by ALAE is quantitatively evaluated, with a focus on determining how the latent space size affects the outcome. The latent space size is systematically varied during training on industrially relevant datasets with representative features.
Based on the application-specific and human-interpretable features, the image quality of multiple ALAEs with varying latent space dimensionalities is compared using statistical tests to select the most favorable latent space dimension.
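The comparison described above can be illustrated with a minimal sketch. The abstract does not name the statistical tests or the features used, so everything here is an assumption: the two-sample Kolmogorov-Smirnov statistic stands in for the unspecified tests, the single scalar feature per image is synthetic, and the names `ks_statistic`, `most_consistent_latent_size`, and the latent sizes 64, 128, and 256 are hypothetical.

```python
import random
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    # The gap between two step functions can only peak at an observed value.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def most_consistent_latent_size(original, generated_by_size):
    """Return the latent size whose feature distribution is closest to the original."""
    scores = {size: ks_statistic(original, feats)
              for size, feats in generated_by_size.items()}
    return min(scores, key=scores.get), scores


# Synthetic stand-in data (assumption): one scalar, human-interpretable feature
# per image. Real values would be measured on original and generated images.
rng = random.Random(0)
original = [rng.gauss(10.0, 1.0) for _ in range(500)]
generated_by_size = {
    64: [rng.gauss(11.5, 1.0) for _ in range(500)],   # shifted feature
    128: [rng.gauss(10.0, 1.0) for _ in range(500)],  # matches the original
    256: [rng.gauss(10.8, 1.5) for _ in range(500)],  # shifted and wider
}
best_size, scores = most_consistent_latent_size(original, generated_by_size)
```

A smaller statistic means the generated feature distribution lies closer to the original one, so under these assumptions the latent size with the smallest value would be preferred. A full analysis would also need significance levels and a correction for multiple comparisons, which this sketch leaves out.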
- DOI: 10.18154/RWTH-2023-03250
- RWTH PUBLICATIONS: RWTH-2023-03250