Promptable Game Models: Text-Guided Game Simulation via Masked Diffusion Models

ACM Transactions on Graphics

Willi Menapace, Aliaksandr Siarohin, Stéphane Lathuilière, Panos Achlioptas,
Vladislav Golyanik, Sergey Tulyakov, Elisa Ricci

Work performed while interning at Snap Inc.


Animation module comparison to baselines

We evaluate our animation model against the Playable Environments baseline (PE) on the task of reconstructing a video from the initial state and actions for each player.

Minecraft

PE

Note the unrealistic player animations and the poor match between the text prompts and the generated results.

Ours small

The full version of our model, trained with reduced computational resources matching those used for the baselines.

Ours

The full version of our model.

Tennis

PE

Note the unrealistic player animations, which result from the model's inability to capture the multimodal distribution of player poses conditioned on text.

Ours small

The full version of our model, trained with reduced computational resources matching those used for the baselines.

Ours

The full version of our model.