DepthGAN

Zifan Shi¹ Yujun Shen² Jiapeng Zhu¹ Dit-Yan Yeung¹ Qifeng Chen¹

¹HKUST ²ByteDance Inc.

Figure: Left: High-fidelity images with the corresponding depths generated by DepthGAN. Right: The 3D reconstruction results.

Overview

Existing methods fail to learn high-fidelity 3D-aware indoor scene synthesis merely from 2D images. In this work, we fill this gap by proposing DepthGAN, which incorporates depth as a 3D prior. Concretely, we propose a dual-path generator, in which one path is responsible for depth generation and its intermediate features are injected into the other path as the condition for appearance rendering. Such a design eases 3D-aware synthesis with explicit geometry information. Meanwhile, we introduce a switchable discriminator that both differentiates real vs. fake domains and predicts the depth from a given input. In this way, the discriminator can take the spatial arrangement into account and advise the generator to learn an appropriate depth condition. Extensive experimental results suggest that our approach is capable of synthesizing indoor scenes with impressively good quality and 3D consistency, significantly outperforming state-of-the-art alternatives.
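
To make the data flow described above concrete, here is a minimal toy sketch in plain Python. It is not the DepthGAN implementation: the class names `DualPathGenerator` and `SwitchableDiscriminator`, the random linear layers, the layer sizes, the use of concatenation to inject depth features, and the RGB-only discriminator input are all illustrative assumptions. It only shows the wiring: a depth path whose intermediate features condition an appearance path, and a discriminator with a shared backbone that can be switched between a real/fake score and a depth prediction.

```python
import math
import random


def linear(rng, n_in, n_out):
    """Random weight matrix (n_out rows of n_in weights); stands in for a learned layer."""
    scale = 1.0 / math.sqrt(n_in)
    return [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]


def matvec(weights, x):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(x):
    return [max(0.0, v) for v in x]


class DualPathGenerator:
    """Toy two-path generator (hypothetical; sizes are arbitrary)."""

    def __init__(self, latent_dim=8, feat_dim=16, n_pixels=16, seed=0):
        rng = random.Random(seed)
        # Depth path: depth latent -> intermediate features -> depth map.
        self.depth_fc = linear(rng, latent_dim, feat_dim)
        self.to_depth = linear(rng, feat_dim, n_pixels)
        # Appearance path: appearance latent -> features -> RGB values.
        self.app_fc = linear(rng, latent_dim, feat_dim)
        self.to_rgb = linear(rng, 2 * feat_dim, 3 * n_pixels)

    def forward(self, z_depth, z_app):
        depth_feat = relu(matvec(self.depth_fc, z_depth))
        depth = matvec(self.to_depth, depth_feat)
        app_feat = relu(matvec(self.app_fc, z_app))
        # Assumption: depth features are injected by simple concatenation.
        conditioned = app_feat + depth_feat
        rgb = [math.tanh(v) for v in matvec(self.to_rgb, conditioned)]
        return rgb, depth


class SwitchableDiscriminator:
    """Toy discriminator: shared backbone, two switchable heads (hypothetical)."""

    def __init__(self, n_pixels=16, feat_dim=16, seed=1):
        rng = random.Random(seed)
        self.backbone = linear(rng, 3 * n_pixels, feat_dim)
        self.score_head = linear(rng, feat_dim, 1)
        self.depth_head = linear(rng, feat_dim, n_pixels)

    def forward(self, rgb, predict_depth=False):
        feat = relu(matvec(self.backbone, rgb))
        if predict_depth:
            # Depth-prediction mode: one value per pixel.
            return matvec(self.depth_head, feat)
        # Real/fake mode: a single scalar score.
        return matvec(self.score_head, feat)[0]


# Usage: fix the depth latent and vary the appearance latent.
rng = random.Random(42)
z_d = [rng.gauss(0.0, 1.0) for _ in range(8)]
z_a1 = [rng.gauss(0.0, 1.0) for _ in range(8)]
z_a2 = [rng.gauss(0.0, 1.0) for _ in range(8)]

G = DualPathGenerator()
D = SwitchableDiscriminator()

rgb1, depth1 = G.forward(z_d, z_a1)
rgb2, depth2 = G.forward(z_d, z_a2)

score = D.forward(rgb1)                            # real/fake score
pred_depth = D.forward(rgb1, predict_depth=True)   # predicted depth map
```

In this sketch the depth map depends only on the depth latent, so `depth1` and `depth2` are identical while `rgb1` and `rgb2` differ. In a trained model, the predicted depth from the discriminator could be compared against the generated depth to supervise the generator, but the exact losses are not specified here and are left out.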

Results

Qualitative comparison between DepthGAN and existing alternatives.

Diverse synthesis via varying the appearance latent code, conditioned on the same depth image.

Diverse geometries via varying the depth latent code, rendered with the same appearance style.

Demo

We include a demo video, which shows the continuous 3D control achieved by our DepthGAN.

BibTeX

  @inproceedings{shi20223daware,
    title   = {3D-Aware Indoor Scene Synthesis with Depth Priors},
    author  = {Shi, Zifan and Shen, Yujun and Zhu, Jiapeng and Yeung, Dit-Yan and Chen, Qifeng},
    booktitle = {ECCV},
    year    = {2022}
  }

Related Work

Thu Nguyen-Phuoc, Chuan Li, Lucas Theis, Christian Richardt, Yong-Liang Yang. HoloGAN: Unsupervised learning of 3D representations from natural images. ICCV, 2019.
Comment: Proposes voxelized and implicit 3D representations and then renders them to 2D image space with a reshape operation.

Katja Schwarz, Yiyi Liao, Michael Niemeyer, Andreas Geiger. GRAF: Generative Radiance Fields for 3D-Aware Image Synthesis. NeurIPS, 2020.
Comment: Proposes generative radiance fields for 3D-aware image synthesis.

Michael Niemeyer, Andreas Geiger. GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields. CVPR, 2021.
Comment: Proposes compositional generative neural feature fields for scene synthesis.

Eric R. Chan, Marco Monteiro, Petr Kellnhofer, Jiajun Wu, Gordon Wetzstein. pi-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis. CVPR, 2021.
Comment: Proposes periodic implicit generative neural feature fields for 3D-aware image synthesis.

Atsuhiro Noguchi, Tatsuya Harada. RGBD-GAN: Unsupervised 3D Representation Learning From Natural Image Datasets via RGBD Image Synthesis. ICLR, 2020.
Comment: Proposes unsupervised depth learning for 3D-aware image synthesis.

ECCV 2022 (Oral)