# Disentangling Geometry and Appearance with Regularised Geometry-Aware Generative Adversarial Networks

Authors
• 1 Imperial College London, London, UK
• 2 Middlesex University London, London, UK
• 3 Samsung AI, Cambridge, UK
Type
Published Article
Journal
International Journal of Computer Vision
Publisher
Springer-Verlag
Publication Date
Mar 02, 2019
Volume
127
Issue
6-7
Pages
824–844
Identifiers
DOI: 10.1007/s11263-019-01155-7
Source
Springer Nature
Abstract
Deep generative models have significantly advanced image generation, enabling generation of visually pleasing images with realistic texture. Apart from the texture, it is the shape geometry of objects that strongly dictates their appearance. However, currently available generative models do not incorporate geometric information into the image generation process. This often yields visual objects of degenerated quality. In this work, we propose a regularized Geometry-Aware Generative Adversarial Network (GAGAN) which disentangles appearance and shape in the latent space. This regularized GAGAN enables the generation of images with both realistic texture and shape. Specifically, we condition the generator on a statistical shape prior. The prior is enforced through mapping the generated images onto a canonical coordinate frame using a differentiable geometric transformation. In addition to incorporating geometric information, this constrains the search space and increases the model's robustness. We show that our approach is versatile, able to generalise across domains (faces, sketches, hands and cats) and sample sizes (from as few as ∼200-30,000 to more than 200,000). We demonstrate superior performance through extensive quantitative and qualitative experiments in a variety of tasks and settings. Finally, we leverage our model to automatically and accurately detect errors or drifting in facial landmark detection and tracking in-the-wild.
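The abstract does not spell out how the statistical shape prior is built. As a rough illustration only, the sketch below fits a generic linear (PCA) shape model to 2D landmark sets, which is one common way to obtain such a prior. It is not the authors' implementation: the function names, the toy data and the assumption that shapes are already aligned are all hypothetical.

```python
import numpy as np


def fit_shape_model(shapes, n_components):
    """Fit a linear (PCA) shape model to landmarks of shape (N, K, 2).

    Assumes the shapes are already aligned to a common frame.
    """
    n = shapes.shape[0]
    flat = shapes.reshape(n, -1)              # (N, 2K)
    mean = flat.mean(axis=0)
    # SVD of the centred data; rows of vt are orthonormal shape modes.
    _, sing, vt = np.linalg.svd(flat - mean, full_matrices=False)
    basis = vt[:n_components]                 # (n_components, 2K)
    std = sing[:n_components] / np.sqrt(max(n - 1, 1))
    return mean, basis, std


def shape_to_params(shape, mean, basis):
    """Project one (K, 2) shape onto the model's modes."""
    return basis @ (shape.reshape(-1) - mean)


def params_to_shape(params, mean, basis):
    """Rebuild a (K, 2) shape from its low-dimensional parameters."""
    return (mean + params @ basis).reshape(-1, 2)


# Toy data (hypothetical): a unit square whose only variation is a
# horizontal stretch, so a single mode explains every shape.
rng = np.random.default_rng(0)
base = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
stretch = np.array([[-1, 0], [1, 0], [1, 0], [-1, 0]], dtype=float)
coeffs = rng.normal(size=50)
shapes = base + 0.1 * coeffs[:, None, None] * stretch

mean, basis, std = fit_shape_model(shapes, n_components=1)
params = shape_to_params(shapes[0], mean, basis)
recon = params_to_shape(params, mean, basis)
```

In a setup like this, a generator could be conditioned on the low-dimensional `params` vector rather than on raw landmark coordinates. Whether GAGAN does exactly this is not stated in the abstract.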