(SIG Asia 2019) miHoYo's Deep-Learning-Based Cloth Animation Workflow
Reposting of this article is prohibited.
Bilibili: Heskey0
Learning an Intrinsic Garment Space for Interactive Authoring of Garment Animation(2019 SIG Asia)
Established workflows are either time- and labor-consuming (i.e., manual editing on dense frames with controllers) or lack keyframe-level control (i.e., physically-based simulation).
Instead, we present a deep-learning-based approach for semi-automatic authoring of garment animation, wherein the user provides the desired garment shape in a selection of keyframes, while our system infers a latent representation of its motion-independent intrinsic parameters (e.g., gravity, cloth materials, etc.). (In plain terms: the artist edits the garment shape of one frame, the system infers the physical and rendering parameters (the intrinsic parameters), and the edit is propagated to the other frames.)
Technically, we learn an intrinsic garment space with a motion-driven autoencoder network, where the encoder maps the garment shapes to the intrinsic space under the condition of body motions, while the decoder acts as a differentiable simulator to generate garment shapes.
Chapter 1. Introduction
【Traditional CG workflow】:
- A common workflow in the modern CG industry for garment animation composition/editing is the keyframe approach. For each keyframe, the artist adjusts the garment shapes, commonly with skinning techniques such as Linear Blend Skinning (LBS) [Kavan and Žára 2005] and Dual Quaternion Skinning (DQS) [Kavan et al. 2007]. The input garment shapes in the keyframes are then propagated to other frames via interpolation. (The artist adjusts the garment shapes of selected frames via skinning, inserts them into the animation sequence, and the edits spread to the other frames by interpolation; a minimal LBS sketch follows this block.)
- However, as the garment geometry is closely correlated with body motion, material properties, and the environment, the garment shape space is exceedingly nonlinear and complex. Achieving physically plausible garment shapes that stay consistent across the motion requires very dense sample points for interpolation within such a space. Consequently, the keyframes must be densely distributed in the sequence (often as many as 20% of the frames), and the process remains extremely labor-intensive. (Because the cloth mesh is coupled with body motion and material, the space is very complex; to make the cloth plausible, the artist must edit an extremely dense set of garment shapes for interpolation, which is exhausting.)
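To make the skinning step concrete, below is a minimal Linear Blend Skinning sketch. This is not the paper's code: the function name, array layout, and the identity-transform check are illustrative assumptions.

```python
# Minimal LBS sketch: each vertex is deformed as a weighted sum of
# per-joint rigid transforms, v' = sum_j w_vj * (R_j v + t_j).
import numpy as np

def linear_blend_skinning(vertices, weights, rotations, translations):
    """
    vertices:     (V, 3)    rest-pose vertex positions
    weights:      (V, J)    per-vertex skinning weights, rows sum to 1
    rotations:    (J, 3, 3) per-joint rotation matrices
    translations: (J, 3)    per-joint translations
    returns:      (V, 3)    deformed vertex positions
    """
    # Transform every vertex by every joint: result[j, v] = R_j @ v + t_j
    per_joint = np.einsum('jab,vb->jva', rotations, vertices) + translations[:, None, :]
    # Blend the per-joint results with the skinning weights
    return np.einsum('vj,jva->va', weights, per_joint)

# Sanity check: identity transforms leave the mesh unchanged.
verts = np.random.rand(4, 3)
w = np.full((4, 2), 0.5)
R = np.stack([np.eye(3)] * 2)
t = np.zeros((2, 3))
assert np.allclose(linear_blend_skinning(verts, w, R, t), verts)
```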
【Our approach】:
- We propose a motion-invariant autoencoder neural network for our task.
- Given a keyframe, the encoder learns to map its garment shape descriptor into a latent space under the condition of the corresponding body motion. The latent vector can be interpreted as a latent representation of the intrinsic parameters. (All the garments generated with the same intrinsic parameters should be mapped to the same location in latent space by factoring out the body motion.)
- The decoder learns to reconstruct the garment geometry from a latent vector, also under the condition of a particular motion. (It is a differentiable simulator for automatic animation generation.)
- Motion information is incorporated into the autoencoder via a motion descriptor learned from a motion encoder, following the idea of the Phase-Functioned Neural Network [Holden et al. 2017].
- The motion descriptor represents a set of coefficients that linearly blend the multiple sub-networks within an autoencoder layer. Thus, the network weights are updated dynamically according to the motion (see the sketch after this list).
- The encoder, decoder, and the motion descriptor are jointly trained.
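As a hedged sketch of what such motion-driven weight blending could look like in PyTorch (PFNN-style; the class name, dimensions, and softmax-normalized blend coefficients are assumptions, not the paper's implementation):

```python
# Sketch of a motion-conditioned layer: the motion descriptor provides
# blend coefficients over K candidate weight matrices, so the layer's
# effective weights change with the body motion.
import torch
import torch.nn as nn

class MotionBlendedLinear(nn.Module):
    def __init__(self, in_dim, out_dim, num_subnets):
        super().__init__()
        # K sub-network weight matrices and biases
        self.weight = nn.Parameter(torch.randn(num_subnets, out_dim, in_dim) * 0.01)
        self.bias = nn.Parameter(torch.zeros(num_subnets, out_dim))

    def forward(self, x, blend):
        # blend: (B, K) coefficients produced by the motion encoder
        W = torch.einsum('bk,koi->boi', blend, self.weight)  # per-sample weights
        b = torch.einsum('bk,ko->bo', blend, self.bias)
        return torch.einsum('boi,bi->bo', W, x) + b

# Toy usage: batch of 8 shape descriptors, 4 sub-networks per layer.
layer = MotionBlendedLinear(in_dim=32, out_dim=16, num_subnets=4)
x = torch.randn(8, 32)
blend = torch.softmax(torch.randn(8, 4), dim=-1)  # stand-in for the motion descriptor
y = layer(x, blend)  # (8, 16)
```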
【Contributions】:
- a novel semi-automatic pipeline for authoring garment animation.
- learning a motion-factorized latent space that encodes intrinsic information of the garment shape.
- learning a differentiable garment simulator to automatically reconstruct garment shapes from an intrinsic garment representation and target body motion.
Chapter 2. Related Work
【Garment Simulation】: Data-driven methods
- Much recent work learns from offline simulations to achieve real-time performance.
- Other works focus on the transfer of simulated garments to different body shapes and poses.
【Garment Capture】:
- As an alternative to simulation, garment capture methods aim to faithfully reconstruct the garment animation from captured data.
【Motion Control via Neural Networks】:
- Inspired by PFNN (Phase-Functioned Neural Network), we aim to control the garment shapes with the body motion. Instead of using a phase function, we utilize a motion encoder to learn a motion descriptor from the body movement, whose values serve as the coefficients that linearly blend the sub-networks in each layer to update the network weights.
Chapter 3. Approach
3.1 Overview
Same as the Introduction section above.
3.2 Data Representation
【Motion】:
- We describe the body motion as the aggregation of the poses of the current frame and the past frames.
- A pose is represented by the 3D positions of the body joints, so we have a pose matrix $P_t \in \mathbb{R}^{J \times 3}$ for a skeleton with $J$ joints.
- The motion signature $M_t = [P_{t-h}, \dots, P_t]$ is defined as the pose matrices for the current frame and the past $h$ frames, describing the motion status at a specific moment (see the sketch below).
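A minimal sketch of assembling the motion signature (the function name, the value of $h$, and the (frames, joints, 3) array layout are assumptions for illustration):

```python
# Stack the joint positions of the current frame and the past h frames
# into one motion-signature array.
import numpy as np

def motion_signature(joint_positions, t, h):
    """
    joint_positions: (T, J, 3) joint positions for the whole sequence
    t:               current frame index (t >= h)
    h:               number of past frames to include
    returns:         (h + 1, J, 3) poses for frames t-h .. t
    """
    return joint_positions[t - h : t + 1]

# Toy usage: 100-frame sequence, 24 joints, current frame plus 9 past frames.
poses = np.random.rand(100, 24, 3)
M_t = motion_signature(poses, t=50, h=9)  # shape (10, 24, 3)
```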
【Garment】:
- We assume the garment dressed on the character is deformed from a template mesh $T = (V, F)$.
- $V \in \mathbb{R}^{N \times 3}$ is a matrix that stores the 3D positions of the $N$ vertices.
- $F$ stores the faces of the triangular mesh.
- At a frame with motion status $M_t$, the garment shape is represented by its deformed vertex positions $G_t$ (the face topology $F$ stays fixed).
【Intrinsic parameter】:
- Explicit intrinsic parameters include simulator parameters, environment parameters, and garment material parameters.
【Dataset】:
- We organize the dataset as a collection of tuples $(G, M, w)$, including the garment shape $G$, the motion signature $M$, and the simulation parameters $w$.
- The intrinsic vector $z$ can be interpreted as a latent representation of $w$ learned by our network.
3.3 Shape Feature Descriptor
We use an MLP autoencoder to encode and decode the garment shape $G$. We train this network via a combination of two types of loss:
- a loss between the 3D vertex positions of the input shape and the shape after encoding and decoding
- a loss between the mesh Laplacians [Taubin 1995] on the vertices of those two shapes, to preserve surface details

Thus, the combined loss function is defined as:
$$\mathcal{L}_{shape} = \| G - \hat{G} \|_2^2 + \lambda \, \| \Delta(G) - \Delta(\hat{G}) \|_2^2$$
where $\hat{G}$ is the shape after encoding and decoding, $\Delta(\cdot)$ denotes the mesh Laplacian, and $\lambda$ balances the two terms. The shape descriptor $S$ is the latent code produced by the encoder.
We train our shape descriptor using only the garment shapes from our dataset.
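A small sketch of this two-term loss under a uniform-Laplacian assumption (the exact Laplacian weighting and $\lambda$ are not specified here, so both are assumptions):

```python
# Vertex-position loss plus mesh-Laplacian loss for the shape autoencoder.
import torch

def uniform_laplacian(verts, neighbors):
    """
    verts:     (V, 3) vertex positions
    neighbors: list of V index tensors, neighbors[i] = 1-ring of vertex i
    returns:   (V, 3) delta coordinates v_i - mean(1-ring of v_i)
    """
    means = torch.stack([verts[n].mean(dim=0) for n in neighbors])
    return verts - means

def shape_loss(pred_verts, gt_verts, neighbors, lam=1.0):
    pos_term = ((pred_verts - gt_verts) ** 2).mean()
    lap_term = ((uniform_laplacian(pred_verts, neighbors)
                 - uniform_laplacian(gt_verts, neighbors)) ** 2).mean()
    return pos_term + lam * lap_term

# Toy usage: a 4-vertex tetrahedron where every vertex neighbors the others.
gt = torch.rand(4, 3)
nbrs = [torch.tensor([1, 2, 3]), torch.tensor([0, 2, 3]),
        torch.tensor([0, 1, 3]), torch.tensor([0, 1, 2])]
loss = shape_loss(gt + 0.01 * torch.randn(4, 3), gt, nbrs)
```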
3.4 Motion Invariant Encoding
The motion signature $M$ is provided as a condition for both mapping functions:
- the encoder $E$ maps the input shape descriptor $S$ to a latent vector $z$ in the intrinsic space, and the decoder $D$ reconstructs the shape descriptor $\hat{S}$ from $z$.
The training loss is defined as:
$$\mathcal{L} = \mathrm{Var}(z) + \beta \, \| S - \hat{S} \|_2^2$$
where:
- $S$ is the input shape descriptor,
- $\hat{S}$ is the recovered shape descriptor,
- $z$ is the latent vector,
- $\mathrm{Var}(z)$ is the variance of $z$ within the batch.
Meaning of the loss:
- The first term aims to minimize the variance in the latent space within the same batch, as inputs generated with the same intrinsic parameters $w$ are supposed to be mapped to the same location in the latent space.
- The second term acts as a regularizer that penalizes the difference between the input shape descriptor and the recovered one, so as to ensure the latent space does not degenerate to zero or an arbitrary constant.
The motion descriptor $m$ is produced by the motion encoder from the motion signature $M$ and conditions both the encoder and the decoder.
The motion-invariant autoencoder and the motion encoder are jointly trained.
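A minimal sketch of this loss, assuming each training batch shares the same intrinsic parameters and $\beta$ is a free weighting:

```python
# Variance term (one batch of shared intrinsics should map to one latent
# point) plus a reconstruction regularizer (the decoder must recover the input).
import torch

def motion_invariant_loss(z, s_in, s_rec, beta=1.0):
    """
    z:     (B, D) latent codes for a batch sharing the same intrinsics
    s_in:  (B, S) input shape descriptors
    s_rec: (B, S) shape descriptors recovered by the decoder
    """
    var_term = z.var(dim=0, unbiased=False).sum()  # per-dimension batch variance
    rec_term = ((s_in - s_rec) ** 2).mean()        # keeps the space from collapsing
    return var_term + beta * rec_term

# Toy usage on random tensors.
loss = motion_invariant_loss(torch.randn(16, 8), torch.randn(16, 32), torch.randn(16, 32))
```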
3.5 Refinement
It is not guaranteed that the predicted garment shape is always collision-free. We apply an efficient refinement step that drags the garment outside the body while preserving its local shape features.
Specifically, given a body shape $B$ and the inferred garment mesh $G$, we detect all the garment vertices inside $B$ as the set $C$. For each vertex $v \in C$, we find its closest point on the body surface, with position $p_v$ and normal $n_v$. Then we deform the garment mesh to updated vertex positions $v'$ by minimizing the following energy:
$$E = \sum_{v} \| \Delta(v') - \Delta(v) \|_2^2 + \mu \sum_{v \in C} \| v' - (p_v + \epsilon \, n_v) \|_2^2$$
Meaning:
- The first term penalizes the Laplacian difference between the deformed mesh and the inferred mesh.
- The second term forces the garment vertices inside the body to move outwards, with $\epsilon$ being a small value that ensures the garment vertices end up sufficiently outside the body.
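A hedged sketch of this energy with a uniform Laplacian and a few gradient steps standing in for a proper solver (all names, the toy mesh, and the closest-point data are assumptions):

```python
# Laplacian-preserving deformation that pushes colliding vertices to
# epsilon above their closest body-surface points.
import torch

def refinement_energy(v_new, v_init, neighbors, colliding, p, n, mu=10.0, eps=2e-3):
    """
    v_new:     (V, 3) current (optimized) garment vertices
    v_init:    (V, 3) inferred garment vertices (fixed)
    neighbors: list of V index tensors (1-ring per vertex)
    colliding: (C,)   indices of vertices detected inside the body
    p, n:      (C, 3) closest body-surface points and outward normals
    """
    def lap(verts):
        means = torch.stack([verts[nb].mean(dim=0) for nb in neighbors])
        return verts - means

    lap_term = ((lap(v_new) - lap(v_init)) ** 2).sum()
    target = p + eps * n  # just outside the body along the normal
    push_term = ((v_new[colliding] - target) ** 2).sum()
    return lap_term + mu * push_term

# Toy usage: 4-vertex mesh with one colliding vertex, minimized with Adam.
v_init = torch.rand(4, 3)
nbrs = [torch.tensor([1, 2, 3]), torch.tensor([0, 2, 3]),
        torch.tensor([0, 1, 3]), torch.tensor([0, 1, 2])]
colliding = torch.tensor([0])
p = torch.zeros(1, 3)
n = torch.tensor([[0.0, 1.0, 0.0]])
v = v_init.clone().requires_grad_(True)
opt = torch.optim.Adam([v], lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    refinement_energy(v, v_init, nbrs, colliding, p, n).backward()
    opt.step()
```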