
Learning to Rig Characters

Wednesday, 04/13/2022 8:00am to 9:30am
Zoom
PhD Dissertation Proposal Defense
Speaker: Zhan Xu

Abstract: 
With the emergence of 3D virtual worlds, 3D social media, and massive online games, the need for diverse, high-quality, animation-ready characters and avatars is greater than ever. To animate characters, artists hand-craft articulation structures, such as animation skeletons and part deformers, a process that requires a significant amount of laborious manual interaction with 2D/3D modeling interfaces. This thesis presents deep learning methods that significantly automate the process of character rigging. Specifically, we present two methods, namely RigNet and APES:


(a) RigNet takes as input a static 3D shape in the form of a polygon mesh and predicts an animation skeleton that matches animators' expectations in joint placement and topology. It also estimates surface skin weights, which determine how the mesh is animated given different skeletal poses. In contrast to prior work that fits pre-defined skeletal templates with hand-tuned objectives, RigNet is able to automatically rig diverse characters, such as humanoids, bipeds, quadrupeds, toys, and birds, with varying articulation structure and geometry. RigNet is based on a deep neural architecture that directly operates on the mesh representation. The architecture is trained on a diverse dataset of rigged models that we mined online and curated. The dataset includes 2.7K polygon meshes, along with their associated skeletons and corresponding skin weights.
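To make the role of the predicted skin weights concrete, below is a minimal sketch of linear blend skinning, the standard deformation model that skeleton-plus-skin-weight rigs of this kind drive. The function name, array shapes, and NumPy implementation here are illustrative assumptions, not RigNet's actual code.

    import numpy as np

    def linear_blend_skinning(vertices, weights, bone_transforms):
        """Deform a mesh with linear blend skinning (LBS).

        vertices:        (V, 3) rest-pose vertex positions
        weights:         (V, J) per-vertex skin weights; each row sums to 1
        bone_transforms: (J, 4, 4) homogeneous transform per joint
                         (rest pose -> target pose)
        Returns the deformed (V, 3) vertex positions.
        """
        V = vertices.shape[0]
        # Homogeneous coordinates: (V, 4)
        vh = np.concatenate([vertices, np.ones((V, 1))], axis=1)
        # Transform every vertex by every bone: (J, V, 4)
        per_bone = np.einsum('jab,vb->jva', bone_transforms, vh)
        # Blend the per-bone results by the skin weights: (V, 4)
        blended = np.einsum('vj,jva->va', weights, per_bone)
        return blended[:, :3]

Given rig outputs of the kind RigNet predicts (a joint hierarchy plus a per-vertex weight matrix), posing the character reduces to composing per-joint transforms and calling a routine like this one.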


(b) APES takes as input 2D raster images depicting a small set of poses of a character shown in a sprite sheet, and identifies articulated parts useful for rigging the character. APES uses a combination of neural network inference and integer linear programming to identify a compact set of articulated body parts, e.g., head, torso, and limbs, that best reconstruct the input poses. Compared to RigNet, which requires a large collection of training models with associated skeletons and skinning weights, APES' neural architecture relies on supervision from (i) pixel correspondences readily available in existing large cartoon image datasets (e.g., Creative Flow), and (ii) a relatively small dataset of 57 cartoon characters segmented into moving parts.
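As a rough illustration of the integer-linear-programming step, the sketch below selects a compact subset of candidate parts under a set-cover-style constraint using SciPy's MILP solver. The coverage matrix, the cost vector, and the reduction of "best reconstructs the input poses" to pure coverage are simplifying assumptions, not APES's actual formulation.

    import numpy as np
    from scipy.optimize import milp, Bounds, LinearConstraint

    def select_parts(coverage, costs):
        """Choose a compact subset of candidate parts.

        coverage: (P, C) binary matrix; coverage[p, c] = 1 if candidate
                  part c explains pixel/region p across the input poses
        costs:    (C,) cost per candidate, penalizing extra parts
        Returns a boolean mask over the C candidates.
        """
        P, C = coverage.shape
        # Each pixel/region must be explained by at least one chosen part.
        cover_all = LinearConstraint(coverage, lb=np.ones(P))
        res = milp(c=costs,
                   constraints=[cover_all],
                   integrality=np.ones(C),   # all decision variables integer
                   bounds=Bounds(0, 1))      # i.e., binary selection
        assert res.success, res.message      # assumes a feasible cover exists
        return res.x.round().astype(bool)

Minimizing the total cost subject to the coverage constraint trades off reconstructing the input poses against the compactness of the part set, which is the same tension the abstract describes.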


Finally, I will discuss my proposed work on predicting animation rigs driven by 3D mesh/point cloud sequences or multiple different poses. Such input can further improve RigNet by disambiguating the placement and connectivity of skeletal joints, along with enabling better estimation of 3D motion flow and tracking.


Advisor: Evangelos Kalogerakis

Join via Zoom