Fast SAM 3D Body: Accelerating SAM 3D Body for Real-Time Full-Body Human Mesh Recovery

1 USC Physical Superintelligence (PSI) Lab, 2 University of California, San Diego, 3 NVIDIA, 4 Meta Reality Labs
Joint corresponding authors
[Figure: Speed-accuracy overview of Fast SAM 3D Body]

Fast SAM 3D Body achieves up to 10.9× end-to-end speedup and over 10,000× faster MHR-to-SMPL conversion, enabling real-time humanoid control from a single RGB stream.

Abstract

SAM 3D Body (3DB) achieves state-of-the-art accuracy in monocular 3D human mesh recovery, yet its inference latency of several seconds per image precludes real-time applications. We present Fast SAM 3D Body, a training-free acceleration framework that reformulates the 3DB inference pathway to achieve interactive rates. By decoupling serial spatial dependencies and applying architecture-aware pruning, we enable parallelized multi-crop feature extraction and streamlined transformer decoding. Moreover, to extract the joint-level kinematics (SMPL) compatible with existing humanoid control and policy learning frameworks, we replace the iterative mesh fitting with a direct feedforward mapping, accelerating this specific conversion by over 10,000×. Overall, our framework delivers up to a 10.9× end-to-end speedup while maintaining on-par reconstruction fidelity, even surpassing 3DB on benchmarks such as LSPET. We demonstrate its utility by deploying Fast SAM 3D Body in a vision-only teleoperation system that, unlike methods reliant on wearable IMUs, enables real-time humanoid control and the direct collection of manipulation policies from a single RGB stream.
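
To illustrate why decoupling the crops matters, here is a minimal toy sketch, not the authors' implementation. Once the crops no longer depend on each other's outputs, they can be stacked into one batch and encoded in a single forward pass. The encoder is a hypothetical stand-in (a single linear projection), and the names `toy_encoder`, `encode_serial`, and `encode_batched`, as well as the choice of three crops, are assumptions made for illustration only.

```python
import numpy as np

def toy_encoder(crops: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for an image encoder: flatten each crop, then project."""
    flat = crops.reshape(crops.shape[0], -1)  # (N, H*W*C)
    return flat @ weights                     # (N, D)

def encode_serial(crops, weights):
    """Encode one crop per call, as a serially dependent pipeline would."""
    return np.concatenate([toy_encoder(c[None], weights) for c in crops], axis=0)

def encode_batched(crops, weights):
    """Encode all crops in one call; valid only if the crops are independent."""
    return toy_encoder(np.stack(crops, axis=0), weights)

rng = np.random.default_rng(0)
# Three same-sized crops (the count and size are illustrative assumptions).
crops = [rng.standard_normal((8, 8, 3)) for _ in range(3)]
weights = rng.standard_normal((8 * 8 * 3, 16))

serial = encode_serial(crops, weights)
batched = encode_batched(crops, weights)

# Both paths produce the same features; batching only changes how the work is scheduled.
assert np.allclose(serial, batched)
```

On a GPU, the batched path replaces several small kernel launches with one larger one, which is typically where the latency saving comes from. The toy above only checks that the results match and does not measure speed.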


Real-Time Humanoid Teleoperation

Fast SAM 3D Body enables real-time, vision-only teleoperation of the Unitree G1 humanoid robot at ~65 ms latency per frame on an NVIDIA RTX 5090. The system directly translates SMPL kinematics for robotic control, enabling collection of whole-body manipulation policies from a single RGB stream.
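
As a rough guide to what this latency means in practice, the short sketch below converts per-frame latency into an approximate frame rate. It assumes frames are processed strictly one at a time with no pipelining or overlap, which is an assumption of this example and not a statement about the deployed system. The function name `latency_ms_to_fps` is hypothetical.

```python
def latency_ms_to_fps(latency_ms: float) -> float:
    """Approximate frame rate when each frame takes `latency_ms` and frames do not overlap."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms

fps = latency_ms_to_fps(65.0)
print(f"{fps:.1f} frames per second")  # prints "15.4 frames per second"
```

Under that assumption, ~65 ms per frame corresponds to roughly 15 control updates per second from the vision pipeline.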

Object Manipulation

Lifting Box

Kneeling & Lower Body

Alternating Knee Switch

Stand Up to Half-Kneeling

Single-Knee Lift

Left Side-on Kneel

Right Side-on Kneel

Single-Knee Kneel

Upper Body & Full Body

Crouch and Rotate

Raise Hands and Squat

Arm Raise with Swivel

Upper-Body Gestures

Raise Elbow

Locomotion

Turn and Move Forward

Forward and Backward

Move Forward


Qualitative Results

Visual comparison between SAM 3D Body and our accelerated Fast SAM 3D Body on diverse in-the-wild images. Our method preserves high-fidelity reconstruction quality across challenging scenarios.

[Figure: Qualitative comparison with SAM 3D Body]

BibTeX

@article{yang2026fastsam3dbody,
      title={Fast SAM 3D Body: Accelerating SAM 3D Body for Real-Time Full-Body Human Mesh Recovery},
      author={Yang, Timing and He, Sicheng and Jing, Hongyi and Yang, Jiawei and Liu, Zhijian and Zou, Chuhang and Wang, Yue},
      journal={arXiv preprint arXiv:2603.15603},
      year={2026}
}