AToM: Amortized Text-to-Mesh using 2D Diffusion

Abstract

We introduce Amortized Text-to-Mesh (AToM), a feed-forward text-to-mesh framework optimized across multiple text prompts simultaneously. In contrast to existing text-to-3D methods, which often entail time-consuming per-prompt optimization and commonly output representations other than polygonal meshes, AToM directly generates high-quality textured meshes in less than 1 second at inference, with around a 10-times reduction in training cost, and generalizes to unseen prompts. Our key idea is a novel triplane-based text-to-mesh architecture with a two-stage training strategy that ensures stable optimization and scalability. In extensive experiments on various prompt benchmarks, AToM significantly outperforms state-of-the-art amortized approaches, with over 4 times higher accuracy (on the DF415 dataset) and more distinguishable, higher-quality 3D outputs. AToM also demonstrates strong generalizability, producing fine-grained 3D details for unseen interpolated prompts, unlike per-prompt solutions.

Method

AToM proposes a triplane-based text-to-mesh architecture with a two-stage amortized optimization strategy that ensures stable training and scalability. AToM is optimized through score distillation sampling and requires no 3D data.
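
As a rough illustration of score distillation sampling (SDS), the sketch below computes the image-space gradient w(t)(ε̂ − ε) that SDS passes back through the renderer to the 3D generator. It is not AToM's implementation. The names `sds_image_gradient` and `make_toy_denoiser` and all parameters are hypothetical, and the toy denoiser is only a stand-in for a pretrained text-conditioned 2D diffusion model.

```python
import numpy as np


def sds_image_gradient(image, predict_noise, alpha_bar, weight, rng):
    """Return w(t) * (eps_hat - eps), the SDS gradient w.r.t. a rendered image.

    image:         rendered view, any array shape
    predict_noise: callable mapping a noisy image to predicted noise
                   (hypothetical stand-in for the 2D diffusion model)
    alpha_bar:     cumulative noise-schedule coefficient at the sampled timestep
    weight:        timestep weighting w(t)
    """
    eps = rng.standard_normal(image.shape)
    # Forward diffusion: mix the rendering with Gaussian noise.
    noisy = np.sqrt(alpha_bar) * image + np.sqrt(1.0 - alpha_bar) * eps
    eps_hat = predict_noise(noisy)
    # SDS skips the denoiser Jacobian and uses the noise residual directly.
    return weight * (eps_hat - eps)


def make_toy_denoiser(target, alpha_bar):
    """Toy denoiser that 'believes' the clean image is `target` (assumption)."""
    def predict_noise(noisy):
        return (noisy - np.sqrt(alpha_bar) * target) / np.sqrt(1.0 - alpha_bar)
    return predict_noise


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = np.ones((4, 4, 3))
    image = np.zeros((4, 4, 3))
    alpha_bar = 0.5
    denoiser = make_toy_denoiser(target, alpha_bar)
    # Plain gradient descent on the pixels; a real pipeline would instead
    # backpropagate this gradient through the renderer into network weights.
    for _ in range(50):
        image -= 0.1 * sds_image_gradient(image, denoiser, alpha_bar, 1.0, rng)
    print(np.abs(image - target).max())
```

In this toy setting the gradient pulls the image toward the denoiser's implied target. In an amortized setup the same signal would be accumulated over many prompts per update rather than one.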

Interpolation Experiments on Pig64

AToM generalizes to unseen interpolated prompts. We compare AToM to AToM Per-Prompt on the Pig64 compositional prompt set, whose prompts follow the format "a pig {activity} {theme}"; rows and columns correspond to different activities and themes. Models are trained on 56 prompts and tested on all 64 prompts; the 8 unseen test prompts lie on the diagonal.
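
The train/test split described above can be sketched as follows. This is a minimal sketch assuming an 8 × 8 grid (consistent with 56 + 8 = 64 prompts) with activities as rows and themes as columns. The activity and theme strings are placeholders, not the actual Pig64 vocabulary, and the function names are hypothetical.

```python
def build_prompt_grid(activities, themes, template="a pig {activity} {theme}"):
    """Build a grid of compositional prompts: one row per activity, one column per theme."""
    return [
        [template.format(activity=a, theme=t) for t in themes]
        for a in activities
    ]


def split_diagonal(grid):
    """Hold out the diagonal prompts for testing and train on the rest."""
    train, test = [], []
    for i, row in enumerate(grid):
        for j, prompt in enumerate(row):
            (test if i == j else train).append(prompt)
    return train, test


# Placeholder vocabulary (assumption): the real activities and themes are not listed here.
activities = [f"activity_{i}" for i in range(8)]
themes = [f"theme_{j}" for j in range(8)]

grid = build_prompt_grid(activities, themes)
train_prompts, unseen_prompts = split_diagonal(grid)
```

Every held-out prompt pairs an activity and a theme that each appear in training, but never together, which is why the diagonal tests interpolation rather than entirely new concepts.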

AToM generalizes to unseen prompts (on the diagonal, from top left to bottom right)

Per-prompt text-to-3D cannot generalize and yields low consistency

Guocheng Qian¹,²

Junli Cao¹

Aliaksandr Siarohin¹

Yash Kant¹,³

Chaoyang Wang¹

Michael Vasilkovsky¹

Hsin-Ying Lee¹

Yuwei Fang¹

Ivan Skorokhodov¹

Peiye Zhuang¹

Igor Gilitschenski³

Jian Ren¹

Bernard Ghanem²

Kfir Aberman¹

Sergey Tulyakov¹

¹Snap Inc.

²King Abdullah University of Science and Technology (KAUST)

³University of Toronto
