DiscoScene: Spatially Disentangled Generative Radiance Field for Controllable 3D-aware Scene Synthesis

1CUHK    2Snap Inc.    3HKUST    4ZJU    5KAUST    6UCLA

DiscoScene performs controllable scene synthesis on various datasets.



We present DisCoScene: a 3D-aware generative model for high-quality and controllable scene synthesis.

The key ingredient of our approach is an abstract object-level representation (3D bounding boxes without semantic annotation) used as the scene layout prior, which is simple to obtain, general enough to describe various scene contents, and yet informative enough to disentangle objects from the background. Moreover, it serves as an intuitive user control for scene editing.

Based on such a prior, our model spatially disentangles the whole scene into object-centric generative radiance fields, learned from only 2D images with global-local discrimination. Our model achieves high generation fidelity and editing flexibility for individual objects, while efficiently composing objects and the background into a complete scene. We demonstrate state-of-the-art performance on many scene datasets, including the challenging Waymo outdoor dataset.
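The composition described above can be illustrated with a toy sketch: each 3D bounding box hosts its own object-centric field, a world-space point is mapped into each box's canonical frame, and the per-object densities are fused (here by summation, as in NeRF-style compositing). The `toy_field` stand-in and all names below are illustrative assumptions, not the paper's actual networks; the background field is omitted for brevity.

```python
import numpy as np

def world_to_canonical(p, center, size):
    """Map a world-space point into a box's canonical [-1, 1]^3 frame."""
    return (p - center) / (size / 2.0)

def toy_field(p_canon, sigma_scale):
    """Stand-in for a per-object generative radiance field: constant
    density inside the canonical box, zero outside (hypothetical)."""
    inside = np.all(np.abs(p_canon) <= 1.0)
    return sigma_scale if inside else 0.0

def composite_density(p_world, boxes):
    """Fuse per-object densities at a world point by summation.
    Each box is (center, size, sigma_scale)."""
    total = 0.0
    for center, size, sigma_scale in boxes:
        p_canon = world_to_canonical(p_world, np.array(center), np.array(size))
        total += toy_field(p_canon, sigma_scale)
    return total

# Two axis-aligned boxes acting as the abstract layout prior.
boxes = [
    ([0.0, 0.0, 0.0], [2.0, 2.0, 2.0], 1.0),  # object 1
    ([3.0, 0.0, 0.0], [1.0, 1.0, 1.0], 2.0),  # object 2
]
print(composite_density(np.array([0.0, 0.0, 0.0]), boxes))  # inside object 1 only -> 1.0
```

Moving a box's `center` moves its object without touching the others, which is the intuition behind the object arrangement and removal/insertion edits shown below.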


Global Camera Control

Object Arrangement

Object Removal/Insertion

Object Restyling

Real Scene Editing


@article{xu2022discoscene,
    author  = {Xu, Yinghao and Chai, Menglei and Shi, Zifan and Peng, Sida and Skorokhodov, Ivan and Siarohin, Aliaksandr and Yang, Ceyuan and Shen, Yujun and Lee, Hsin-Ying and Zhou, Bolei and Tulyakov, Sergey},
    title   = {DiscoScene: Spatially Disentangled Generative Radiance Field for Controllable 3D-aware Scene Synthesis},
    journal = {arXiv preprint arXiv:2212.11984},
    year    = {2022},
}

We thank Jiatao Gu, Willi Menapace, Jian Ren, Panos Achlioptas, Tai Wang, and Zian Wang for fruitful discussions and comments about this work.

The website template was adapted from Nerfies.