MultiPhys: Multi-Person Physics-aware 3D Motion Estimation
Recovers multi-person motion from monocular videos by incorporating physics simulation for coherent spatial placement and robustness against occlusions and jittering.
MultiPhys uses a physics simulation engine to recover the motion of multiple interacting people in a physically plausible manner. Rather than relying on detected 2D keypoints in a feedforward model, the approach initializes from the output of SLAHMR, which performs global optimization over the entire sequence. Because SLAHMR is physics-agnostic, its outputs can be noisy, particularly in the spatial placement of bodies, which leads to inter-person penetrations. To address this, the preliminary body poses are fed into the physics simulator autoregressively to obtain physically compliant motion estimates. Feeding these poses in naively, however, makes it hard for the policy to generate the control signals that drive the simulation, and the resulting motion degrades.
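The autoregressive idea can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration and is not taken from MultiPhys: the body is reduced to a single point mass moving vertically, a hand-written PD controller stands in for the learned policy, and the names `State`, `pd_control`, `simulate_step`, and `track` are hypothetical. It shows only the loop structure: at each frame the controller sees the current simulated state and the next kinematic reference, the simulator advances one step, and the result becomes the input for the next frame.

```python
from dataclasses import dataclass


@dataclass
class State:
    height: float    # height of the point mass above the ground (m)
    velocity: float  # vertical velocity (m/s)


def pd_control(state, target_height, kp=200.0, kd=20.0):
    """Hypothetical stand-in for the policy: a PD controller that
    outputs a force pulling the mass toward the reference height."""
    return kp * (target_height - state.height) - kd * state.velocity


def simulate_step(state, force, dt=1 / 30, mass=1.0, gravity=9.81):
    """Toy physics step: semi-implicit Euler plus a hard ground constraint."""
    accel = force / mass - gravity
    velocity = state.velocity + accel * dt
    height = state.height + velocity * dt
    if height < 0.0:
        # The simulator never lets the mass sink below the ground plane.
        height, velocity = 0.0, 0.0
    return State(height, velocity)


def track(reference, initial):
    """Autoregressive tracking: each step starts from the previous
    simulated state, not from the (possibly implausible) reference."""
    state = initial
    simulated = []
    for target in reference:
        force = pd_control(state, target)
        state = simulate_step(state, force)
        simulated.append(state.height)
    return simulated


# A noisy kinematic reference that dips below the ground (negative heights).
reference = [0.5, 0.2, -0.1, -0.3, 0.1]
simulated = track(reference, State(height=0.5, velocity=0.0))
```

Even though the reference contains ground-penetrating frames, the simulated heights stay non-negative because the constraint is enforced inside the simulation step rather than patched onto the output afterwards.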
The framework projects all sequences into a physically plausible space, which makes it straightforward to convert the resulting poses back to the SMPL representation. The method is evaluated on three datasets with varying levels of interaction, CHI3D, Hi4D, and ExPI, each posing different challenges in inter-person interaction and dynamics. Compared with the baselines EmbPose-MP and SLAHMR, the proposed approach reduces inter-person penetrations, foot skating, and ground penetration. An ablation study of the loop-N component shows that it improves the match between simulated poses and kinematic reference poses, especially for highly dynamic motions.
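To make two of the reported physical-plausibility measures concrete, here is a minimal sketch of how ground penetration and foot skating could be computed from a per-frame foot trajectory. The definitions, the thresholds (`contact_height`, `slide_threshold`), the z-up convention, and the function names are all assumptions for illustration; exact definitions vary between papers, and these are not necessarily the ones used in the MultiPhys evaluation.

```python
import math


def ground_penetration(foot_positions, ground_height=0.0):
    """Mean depth by which the foot lies below the ground plane,
    averaged over all frames (assumes z is the up axis)."""
    depths = [max(0.0, ground_height - z) for (_, _, z) in foot_positions]
    return sum(depths) / len(depths)


def foot_skating(foot_positions, contact_height=0.05, slide_threshold=0.01):
    """Fraction of consecutive-frame pairs in which the foot is near the
    ground in both frames yet slides horizontally by more than the threshold."""
    pairs = list(zip(foot_positions, foot_positions[1:]))
    if not pairs:
        return 0.0
    skating = 0
    for (x0, y0, z0), (x1, y1, z1) in pairs:
        in_contact = z0 <= contact_height and z1 <= contact_height
        slide = math.hypot(x1 - x0, y1 - y0)
        if in_contact and slide > slide_threshold:
            skating += 1
    return skating / len(pairs)


# Four frames of one foot joint as (x, y, z) in metres, z up.
foot = [
    (0.00, 0.0, 0.00),   # planted
    (0.05, 0.0, 0.00),   # still on the ground but moved 5 cm: skating
    (0.05, 0.0, -0.02),  # 2 cm below the ground: penetration
    (0.05, 0.0, 0.30),   # lifted
]
penetration = ground_penetration(foot)
skate_ratio = foot_skating(foot)
```

In this toy trajectory only the third frame is below the ground, and only the first frame pair counts as skating, so both measures are small but non-zero. An inter-person penetration measure would additionally need body meshes or collision volumes and is left out of this sketch.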