Promptable Game Models: Text-Guided Game Simulation via Masked Diffusion Models

ACM Transactions on Graphics

Willi Menapace, Aliaksandr Siarohin, Stéphane Lathuilière, Panos Achlioptas,
Vladislav Golyanik, Sergey Tulyakov, Elisa Ricci

Work performed while interning at Snap Inc.


PGM Datasets

To build Promptable Game Models, we contribute two annotated monocular video datasets that we will make publicly available. Unlike existing text-video datasets, which provide a single generic caption per video or captions only weakly aligned to the video content, our datasets provide a text action for every player in every frame, describing in detail and in technical terms what the player is doing.
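To make the per-frame, per-player annotation granularity concrete, here is a minimal Python sketch of how such records could be represented. It is an illustration only, not the released dataset format: the class names, field names, player identifiers, and example action strings are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FrameAnnotation:
    """Annotations for one video frame (hypothetical layout)."""
    frame_index: int
    # Maps a player identifier to that player's text action in this frame.
    actions: Dict[str, str] = field(default_factory=dict)


@dataclass
class VideoAnnotation:
    """All frame annotations for one video (hypothetical layout)."""
    video_id: str
    frames: List[FrameAnnotation] = field(default_factory=list)

    def actions_for_player(self, player_id: str) -> List[str]:
        """Return the player's text actions in frame order."""
        return [f.actions[player_id] for f in self.frames if player_id in f.actions]


# Made-up example values; the action strings are placeholders, not dataset content.
video = VideoAnnotation(
    video_id="example_video",
    frames=[
        FrameAnnotation(0, {"player_0": "prepares to serve", "player_1": "waits at the baseline"}),
        FrameAnnotation(1, {"player_0": "serves the ball", "player_1": "waits at the baseline"}),
    ],
)

player_0_actions = video.actions_for_player("player_0")
```

In contrast to a single caption per video, each frame here carries one text action per player, so a player's actions can be read out as a frame-aligned sequence.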

Tennis

15 hours of video with annotated cameras, 3D skeletons, 3D ball positions, and manually annotated actions for each frame and player.

Minecraft

1 hour of video with annotated cameras, 3D skeletons, and actions for each frame. The annotations are produced automatically using a Minecraft plugin that we will make publicly available.