DELTAv2: Accelerating Dense 3D Tracking


Tuan Duc Ngo1,2 Ashkan Mirzaei1 Guocheng Qian1 Hanwen Liang1,4 Chuang Gan2 Evangelos Kalogerakis2,3 Peter Wonka1,5 Chaoyang Wang1
1 Snap Inc 2 UMass Amherst 3 TU Crete 4 University of Toronto 5 KAUST

[Paper] [arXiv] [Code]






DELTAv2 accelerates dense 3D tracking by a factor of 5 compared to DELTA while achieving comparable tracking accuracy.

Abstract

We propose a novel algorithm for accelerating dense long-term 3D point tracking in videos. Through analysis of existing state-of-the-art methods, we identify two major computational bottlenecks. First, transformer-based iterative tracking becomes expensive when handling a large number of trajectories. To address this, we introduce a coarse-to-fine strategy that begins tracking with a small subset of points and progressively expands the set of tracked trajectories. The newly added trajectories are initialized using a learnable interpolation module, which is trained end-to-end alongside the tracking network. Second, we propose an optimization that significantly reduces the cost of correlation feature computation, another key bottleneck in prior methods. Together, these improvements lead to a 5–100x speedup over existing approaches while maintaining state-of-the-art tracking accuracy.

Motivation



Despite its efficient transformer design, DELTA's iterative refinement remains computationally expensive. Our coarse-to-fine approach tracks a sparse set of points first, then progressively densifies it using a learnable interpolator. Combined with optimized correlation feature computation, this yields a 5–100x speedup while maintaining accuracy.

Method



Our coarse-to-fine iterative dense tracking reduces computation by tracking only a subsample of trajectory points in early iterations and progressively increasing their density in later ones. A learnable interpolation module uses attention to infer the motion of untracked pixels from nearby tracked ones, enabling efficient and adaptive trajectory propagation.
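The coarse-to-fine idea can be illustrated with a small NumPy sketch. This is not the DELTAv2 implementation: the stride schedule, the function names (`grid_points`, `interpolate_motion`, `refine`, `coarse_to_fine_track`), and the per-frame 3D displacement layout are all assumptions made for illustration. In particular, the learnable attention-based interpolator is replaced here by a fixed softmax over negative squared pixel distance to the k nearest tracked points, and the transformer refinement step is an identity placeholder.

```python
import numpy as np


def grid_points(height, width, stride):
    """Pixel coordinates (x, y) of a regular grid sampled every `stride` pixels."""
    ys, xs = np.meshgrid(
        np.arange(0, height, stride), np.arange(0, width, stride), indexing="ij"
    )
    return np.stack([xs.ravel(), ys.ravel()], axis=-1).astype(np.float64)


def interpolate_motion(src_xy, src_motion, dst_xy, k=4, temperature=1.0):
    """Initialize motion for new points from nearby tracked points.

    Assumption: a fixed distance-based softmax stands in for the learned
    attention weights described in the text.
    src_xy: (N, 2), src_motion: (N, T, 3), dst_xy: (M, 2) -> (M, T, 3)
    """
    # Squared pixel distance between every new point and every tracked point.
    d2 = ((dst_xy[:, None, :] - src_xy[None, :, :]) ** 2).sum(-1)  # (M, N)
    k = min(k, src_xy.shape[0])
    nn = np.argsort(d2, axis=1)[:, :k]                             # (M, k)
    logits = -np.take_along_axis(d2, nn, axis=1) / temperature
    logits -= logits.max(axis=1, keepdims=True)                    # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)                  # rows sum to 1
    # Weighted average of the neighbours' trajectories.
    return np.einsum("mk,mktc->mtc", weights, src_motion[nn])


def refine(motion):
    """Hypothetical placeholder for one refinement iteration of the tracker."""
    return motion


def coarse_to_fine_track(height, width, num_frames, strides=(8, 4, 2, 1)):
    """Track a sparse grid first, then densify it stage by stage."""
    xy = grid_points(height, width, strides[0])
    motion = refine(np.zeros((xy.shape[0], num_frames, 3)))
    counts = [xy.shape[0]]
    for stride in strides[1:]:
        new_xy = grid_points(height, width, stride)
        # New trajectories start from interpolated motion, not from scratch.
        motion = refine(interpolate_motion(xy, motion, new_xy))
        xy = new_xy
        counts.append(xy.shape[0])
    return xy, motion, counts
```

Under these assumptions, `coarse_to_fine_track(16, 16, 3)` refines 4, 16, 64, and finally 256 trajectories across its four stages, so only the last stage pays the cost of the full-resolution point set.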

Results





Comparison with 3D tracking approaches: SceneTracker, SpatialTracker, and DELTA







More results of 3D dense tracking can be found here.

Comparison with 2D tracking approaches (3D-lifted with depth): CoTracker and LocoTrack






More results of 3D dense tracking can be found here.