TGA: True-to-Geometry Avatar Dynamic Reconstruction

Bo Guo1, Sijia Wen1†, Ziwei Wang1, Yifan Zhao2
1Beijing Advanced Innovation Center for Future Blockchain and Privacy Computing,
School of Artificial Intelligence, Beihang University
2State Key Laboratory of Virtual Reality Technology and Systems, SCSE&QRI, Beihang University
(† Corresponding author)

NeurIPS 2025 Spotlight

Example results of our TGA. Our method (a) generates high-fidelity, frame-by-frame textured meshes from multi-view videos, (b) captures avatar-specific details across wide viewing angles, and (c) delivers realistic cross-reenactment performance.

Abstract

Recent advances in 3D Gaussian Splatting (3DGS) have improved the visual fidelity of dynamic avatar reconstruction. However, existing methods often overlook the inherent chromatic similarity of human skin tones, leading to poor capture of intricate facial geometry under subtle appearance changes. The root cause is the affine approximation of Gaussian projection, which is not perspective-aware and therefore fails to model depth-induced shear effects. To this end, we propose True-to-Geometry Avatar Dynamic Reconstruction (TGA), a perspective-aware 4D Gaussian avatar framework that sensitively captures fine-grained facial variations for accurate 3D geometry reconstruction. Specifically, to enable color-sensitive and geometry-consistent Gaussian representations under dynamic conditions, we introduce a Perspective-Aware Gaussian Transformation that jointly models temporal deformation and spatial projection by integrating Jacobian-guided adaptive deformation into a homogeneous projection formulation. Furthermore, we develop Incremental BVH Tree Pivoting for fast frame-by-frame mesh extraction from 4D Gaussian representations. A dynamic Gaussian Bounding Volume Hierarchy (BVH) tree models the topological relationships among points; active points are filtered out via BVH pivoting and subsequently re-triangulated for surface reconstruction. Extensive experiments demonstrate that TGA achieves superior geometric accuracy.
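For background, the affine approximation the abstract critiques is the standard EWA-style projection used in 3DGS: the perspective map is linearized at the Gaussian mean via its Jacobian, so the screen-space covariance ignores how shear grows with depth across the Gaussian's extent. The sketch below is our own illustration of that baseline, not TGA's homogeneous formulation; all function and variable names are ours.

```python
import numpy as np

def project_gaussian_affine(mean_cam, cov_world, R, fx, fy):
    """Affine (Jacobian) approximation of perspective Gaussian projection.

    mean_cam:  Gaussian center in camera coordinates (x, y, z), z > 0.
    cov_world: 3x3 world-space covariance of the Gaussian.
    R:         3x3 world-to-camera rotation.
    Returns the 2x2 screen-space covariance.
    """
    x, y, z = mean_cam
    # Jacobian of (fx*x/z, fy*y/z) w.r.t. (x, y, z), evaluated ONLY at the
    # mean -- this single linearization point is the source of the error.
    J = np.array([
        [fx / z, 0.0,    -fx * x / z**2],
        [0.0,    fy / z, -fy * y / z**2],
    ])
    cov_cam = R @ cov_world @ R.T          # rotate covariance into camera frame
    return J @ cov_cam @ J.T               # 2x2 projected covariance
```

For an isotropic Gaussian centered on the optical axis this yields an isotropic 2x2 covariance, while an off-axis center already produces anisotropy through the depth-derivative column of `J`; what the approximation cannot represent is how that term varies within one Gaussian.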

Method

Pipeline
Given multi-view RGB sequences, we first track facial dynamics with FLAME (a). During the Perspective-Aware Transformation (b), we apply Jacobian-guided deformation and homogeneous projection for accurate geometric modeling. After (c), we build and dynamically update a Gaussian BVH (d), where BVH pivoting adaptively filters hopping points, enabling geometrically accurate surface extraction via Marching Tetrahedra (rightmost column).

Other Results

Novel view synthesis


Self-reenactment

Related Links

There is a lot of excellent work that appeared around the same time as ours.

GaussianAvatars pioneered rigging Gaussian point clouds to a parametric morphable face model.

Topo4D builds temporally consistent 4D topology and high-fidelity textures for dynamic scene reconstruction across frames.

SurFhead reconstructs geometrically accurate avatars using 2D Gaussian surfels.

We kindly recommend checking out Gaussian Opacity Fields, which leverages ray-tracing-based volume rendering of 3D Gaussians to directly extract geometry via level-set identification and adaptive Marching Tetrahedra.