Figure: Left: High-fidelity images with the corresponding depth maps generated by DepthGAN.
Right: The 3D reconstruction results.
Overview
Existing methods fail to learn high-fidelity 3D-aware indoor scene synthesis merely
from 2D images. In this work, we fill this gap by proposing DepthGAN, which
incorporates depth as a 3D prior. Concretely, we propose a dual-path generator, where
one path is responsible for depth generation and its intermediate features are injected
into the other path as the condition for appearance rendering. Such a design eases
3D-aware synthesis with explicit geometry information. Meanwhile, we introduce a
switchable discriminator that both differentiates the real and fake domains and
predicts the depth of a given input. In this way, the discriminator can take the
spatial arrangement into account and advise the generator to learn an appropriate depth
condition. Extensive experimental results suggest that our approach synthesizes indoor
scenes with high visual quality and 3D consistency, significantly outperforming
state-of-the-art alternatives.
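To make the dual-path design concrete, below is a minimal PyTorch sketch of one way the depth path's intermediate features could be injected into the appearance path. The module layout, layer sizes, and concatenation-based injection are illustrative assumptions on our part, not the paper's exact architecture.

import torch
import torch.nn as nn


def up_block(c_in, c_out, first=False):
    # 4x4 transposed conv; the first block maps a 1x1 latent to 4x4,
    # later blocks double the spatial resolution.
    if first:
        return nn.Sequential(nn.ConvTranspose2d(c_in, c_out, 4), nn.ReLU())
    return nn.Sequential(nn.ConvTranspose2d(c_in, c_out, 4, 2, 1), nn.ReLU())


class DualPathGenerator(nn.Module):
    # Sketch (assumed layout): the depth path synthesizes a depth map, and
    # its intermediate features condition the appearance path.
    def __init__(self, z_dim=128, feat=64):
        super().__init__()
        # Depth path: latent -> geometry features -> 1-channel depth map.
        self.depth_path = nn.Sequential(up_block(z_dim, feat, first=True),
                                        up_block(feat, feat))
        self.to_depth = nn.Conv2d(feat, 1, 3, padding=1)
        # Appearance path: its own latent, later fused with depth features.
        self.app_path = nn.Sequential(up_block(z_dim, feat, first=True),
                                      up_block(feat, feat))
        # Input channels double because depth features are concatenated in.
        self.to_rgb = nn.Sequential(nn.Conv2d(feat * 2, feat, 3, padding=1),
                                    nn.ReLU(),
                                    nn.Conv2d(feat, 3, 3, padding=1))

    def forward(self, z_depth, z_app):
        feat_d = self.depth_path(z_depth)   # intermediate geometry features
        depth = self.to_depth(feat_d)       # synthesized depth map
        feat_a = self.app_path(z_app)
        # Injection: depth features serve as the condition for rendering.
        rgb = self.to_rgb(torch.cat([feat_a, feat_d], dim=1))
        return rgb, depth

With two latents z_depth and z_app (e.g., torch.randn(2, 128, 1, 1) each), the forward pass returns a paired image and depth map; varying only z_app while fixing z_depth reuses the same geometry, matching the appearance/geometry disentanglement shown in the Results section.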
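Likewise, the switchable discriminator can be read as a shared encoder with two heads, one producing a real/fake logit and one predicting a depth map, with the depth branch switched on when geometry supervision is needed. The following is a minimal sketch under the same caveat: the layer sizes, head design, and boolean switch are our assumptions, not the paper's exact formulation.

import torch.nn as nn


class SwitchableDiscriminator(nn.Module):
    # Sketch: a shared encoder feeds both a real/fake head and a
    # depth-prediction head; predict_depth toggles the second branch.
    def __init__(self, feat=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, feat, 4, 2, 1), nn.LeakyReLU(0.2),    # 8x8 -> 4x4
            nn.Conv2d(feat, feat, 4, 2, 1), nn.LeakyReLU(0.2)) # 4x4 -> 2x2
        self.adv_head = nn.Linear(feat * 2 * 2, 1)             # real/fake logit
        self.depth_head = nn.Sequential(                       # coarse depth map
            nn.Upsample(scale_factor=4, mode='bilinear'),
            nn.Conv2d(feat, 1, 3, padding=1))

    def forward(self, rgb, predict_depth=False):
        h = self.encoder(rgb)
        logit = self.adv_head(h.flatten(1))
        if predict_depth:
            # Switch on the depth branch so spatial arrangement can be
            # fed back to the generator as geometry supervision.
            return logit, self.depth_head(h)
        return logit

One plausible use of the depth head, under these assumptions, is to penalize the distance between its prediction on a generated image and the depth produced by the generator itself, pushing the generator toward appearances that agree with the learned depth condition.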
Results
Qualitative comparison between DepthGAN and existing alternatives.
Diverse synthesis via varying the appearance latent code, conditioned on the same depth image.
Diverse geometries via varying the depth latent code, rendered with the same appearance style.
Demo
We include a demo video, which shows the continuous 3D control achieved by our DepthGAN.
BibTeX
@inproceedings{shi20223daware,
  title     = {3D-Aware Indoor Scene Synthesis with Depth Priors},
  author    = {Shi, Zifan and Shen, Yujun and Zhu, Jiapeng and Yeung, Dit-Yan and Chen, Qifeng},
  booktitle = {ECCV},
  year      = {2022}
}
Comment: Proposes voxelized and implicit 3D representations, which are then rendered to 2D image space with a reshape operation.