Promptable Game Models: Text-Guided Game Simulation via Masked Diffusion Models

ACM Transactions on Graphics

Willi Menapace, Aliaksandr Siarohin, Stéphane Lathuilière, Panos Achlioptas,
Vladislav Golyanik, Sergey Tulyakov, Elisa Ricci

Work performed while interning at Snap Inc.


Animation module ablation

We evaluate the choice of the diffusion framework by comparing our method against an otherwise equivalent model trained with a reconstruction objective instead of the diffusion objective.
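To make the distinction between the two training objectives concrete, the following is a minimal sketch, not the implementation used in the paper. It assumes a one-dimensional toy sequence, a boolean mask marking the entries to be generated, and an epsilon-prediction diffusion loss; the names `reconstruction_loss`, `diffusion_loss`, `model`, and `alpha_bar` are hypothetical and chosen only for illustration.

```python
import math
import random


def mse(a, b):
    """Mean squared error between two equal-length, non-empty sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def reconstruction_loss(model, x0, mask):
    """Reconstruction objective: masked entries are zeroed out and the
    model regresses their clean values in a single forward pass.
    Assumes at least one entry of `mask` is True."""
    masked_input = [0.0 if m else v for v, m in zip(x0, mask)]
    pred = model(masked_input, None)  # no timestep is needed here
    pred_m = [p for p, m in zip(pred, mask) if m]
    true_m = [v for v, m in zip(x0, mask) if m]
    return mse(pred_m, true_m)


def diffusion_loss(model, x0, mask, alpha_bar, rng):
    """Diffusion objective (epsilon-prediction is an assumption): masked
    entries are corrupted at a random noise level and the model predicts
    the added noise. Unmasked entries stay clean as conditioning."""
    t = rng.randrange(len(alpha_bar))  # random diffusion timestep
    a = alpha_bar[t]                   # cumulative signal level at step t
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    noisy = [
        math.sqrt(a) * v + math.sqrt(1.0 - a) * e if m else v
        for v, e, m in zip(x0, eps, mask)
    ]
    pred = model(noisy, t)
    pred_m = [p for p, m in zip(pred, mask) if m]
    eps_m = [e for e, m in zip(eps, mask) if m]
    return mse(pred_m, eps_m)
```

In this sketch both losses accept any callable `model(x, t)` that returns a list of the same length as `x`. The reconstruction variant has a single regression target per masked entry, whereas the diffusion variant trains the model across noise levels so that samples can be drawn iteratively at inference time.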

Minecraft

Reconstruction Transformer

Note the unrealistic player animations and the mismatch between the text prompts and the generated results.

Ours small

The full version of our model, trained with a reduced amount of computational resources that matches the amount used for the baselines.

Ours

The full version of our model.

Tennis

Reconstruction Transformer

Note the unrealistic player animations and the player sliding artifacts.

Ours small

The full version of our model, trained with a reduced amount of computational resources that matches the amount used for the baselines.

Ours

The full version of our model.