FRESA: Feedforward Reconstruction of Personalized Skinned Avatars from Few Images
FRESA. We present a novel method to reconstruct personalized skinned avatars with realistic pose-dependent animation in a single feed-forward pass, which generalizes to casually taken phone photos without any fine-tuning. We visualize predicted skinning weights associated with the most important joints in (b), and colormaps of per-vertex displacement magnitudes (normalized across all vertices to highlight large deformations) during animation in (c).
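For illustration, a colormap like the one in (c) can be produced by normalizing per-vertex displacement magnitudes across all vertices. A minimal sketch with placeholder data (not the paper's code):

```python
import numpy as np
import matplotlib.pyplot as plt

# disp: (V, 3) hypothetical per-vertex displacement vectors during animation
disp = np.random.randn(6890, 3) * 0.01  # placeholder data for illustration

mag = np.linalg.norm(disp, axis=1)                        # per-vertex magnitude
mag = (mag - mag.min()) / (mag.max() - mag.min() + 1e-8)  # normalize across all vertices
colors = plt.cm.viridis(mag)                              # (V, 4) RGBA vertex colors
```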
We present a novel method for reconstructing personalized 3D human avatars with realistic animation from only a few images. Due to the large variations in body shapes, poses, and cloth types, existing methods mostly require hours of per-subject optimization during inference, which limits their practical applications. In contrast, we learn a universal prior from over a thousand clothed humans to achieve instant feedforward generation and zero-shot generalization. Specifically, instead of rigging the avatar with shared skinning weights, we jointly infer personalized avatar shape, skinning weights, and pose-dependent deformations, which effectively improves overall geometric fidelity and reduces deformation artifacts. Moreover, to normalize pose variations and resolve the coupled ambiguity between canonical shapes and skinning weights, we design a 3D canonicalization process to produce pixel-aligned initial conditions, which helps to reconstruct fine-grained geometric details. We then propose a multi-frame feature aggregation scheme that robustly reduces artifacts introduced during canonicalization and fuses a plausible avatar preserving person-specific identities. Finally, we train the model in an end-to-end framework on a large-scale capture dataset, which contains diverse human subjects paired with high-quality 3D scans. Extensive experiments show that our method generates more authentic reconstruction and animation than state-of-the-art methods, and generalizes directly to inputs from casually taken phone photos.
We propose a novel method that reconstructs personalized skinned avatars in a single feed-forward pass via a universal clothed human model. Specifically, given N frames of posed human images from front and back views, we first estimate their normal and segmentation images, and then unpose them for each frame and view to produce pixel-aligned initial conditions in a 3D canonicalization process. Next, we aggregate multi-frame references into a single bi-plane feature that represents the subject identity. By sampling from this feature, we jointly decode the personalized canonical avatar mesh, skinning weights, and pose-dependent vertex displacements from a canonical tetrahedral grid. Finally, we adopt a multi-stage training process to train the model with posed-space ground truth and canonical-space regularization.
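The unposing step can be understood as inverse linear blend skinning applied to posed 3D points (e.g., points lifted from the estimated normal images). Below is a minimal sketch assuming template skinning weights (such as SMPL's) and posed joint transformations; all names and shapes are illustrative, not the paper's interface:

```python
import torch

def unpose_points(x_posed, skin_weights, joint_tf):
    """Inverse linear blend skinning: map posed points back to canonical space.

    x_posed:      (V, 3) posed 3D points (hypothetical input)
    skin_weights: (V, J) per-point skinning weights, here taken from a template
                  body model; FRESA later predicts personalized weights
    joint_tf:     (J, 4, 4) posed joint transformations
    """
    # Blend the per-joint transforms with the skinning weights: (V, 4, 4)
    blended = torch.einsum('vj,jrc->vrc', skin_weights, joint_tf)
    # Invert the blended transform and apply it to homogeneous points
    x_h = torch.cat([x_posed, torch.ones_like(x_posed[:, :1])], dim=-1)  # (V, 4)
    x_canon = torch.einsum('vrc,vc->vr', torch.linalg.inv(blended), x_h)
    return x_canon[:, :3]
```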
Our method produces plausible personalized canonical avatars from various source images.
Personalized skinning weights enable smooth animation across various body shapes and cloth types.
Our model generates personalized pose-dependent deformations, producing fine-grained wrinkles and dynamic effects for clothes, while also reducing LBS artifacts in animation.
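For reference, combining pose-dependent displacements with personalized weights in standard linear blend skinning looks roughly as follows; this is a generic sketch, not the paper's exact implementation:

```python
import torch

def animate(v_canon, delta_v, skin_weights, joint_tf):
    """Forward linear blend skinning with personalized weights.

    v_canon:      (V, 3) canonical avatar vertices
    delta_v:      (V, 3) predicted pose-dependent vertex displacements
    skin_weights: (V, J) predicted personalized skinning weights (rows sum to 1)
    joint_tf:     (J, 4, 4) joint transformations of the target pose
    Names and shapes are illustrative only.
    """
    v = v_canon + delta_v  # apply pose-dependent deformation in canonical space
    blended = torch.einsum('vj,jrc->vrc', skin_weights, joint_tf)  # (V, 4, 4)
    v_h = torch.cat([v, torch.ones_like(v[:, :1])], dim=-1)        # (V, 4) homogeneous
    v_posed = torch.einsum('vrc,vc->vr', blended, v_h)
    return v_posed[:, :3]
```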
When reposed to an unseen pose, we produce better results with reduced deformation artifacts and fine-grained wrinkle details compared to baseline methods that use nearest-neighbor skinning weights.
By aggregating multiple unposed results, our method produces more plausible canonical shapes for loose parts (hair and skirts) and is robust to noisy canonicalization.
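One simple way to realize such robust aggregation is confidence-weighted fusion of per-frame features, so frames corrupted by canonicalization noise are down-weighted. The module below is a hypothetical sketch; the paper's actual aggregation design may differ:

```python
import torch
import torch.nn as nn

class FrameAggregator(nn.Module):
    """Hypothetical multi-frame aggregator: predicts a per-pixel confidence for
    each unposed frame and fuses features by softmax-weighted averaging."""

    def __init__(self, channels: int):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=1)  # per-pixel confidence logit

    def forward(self, frame_feats: torch.Tensor) -> torch.Tensor:
        # frame_feats: (N, C, H, W) bi-plane features from N unposed frames
        logits = self.score(frame_feats)           # (N, 1, H, W)
        weights = torch.softmax(logits, dim=0)     # normalize across frames
        return (weights * frame_feats).sum(dim=0)  # (C, H, W) fused identity feature
```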
Our model can be extended to produce textured meshes by learning a universal color feature.
@misc{wang2025fresafeedforwardreconstructionpersonalizedskinned,
title={FRESA: Feedforward Reconstruction of Personalized Skinned Avatars from Few Images},
author={Rong Wang and Fabian Prada and Ziyan Wang and Zhongshi Jiang and Chengxiang Yin and Junxuan Li and Shunsuke Saito and Igor Santesteban and Javier Romero and Rohan Joshi and Hongdong Li and Jason Saragih and Yaser Sheikh},
year={2025},
eprint={2503.19207},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2503.19207},
}