Title: Action-Conditioned 3D Human Motion Synthesis with Transformer VAE (ICCV 2021)
We tackle the problem of action-conditioned generation of realistic and diverse human motion sequences. In contrast to methods that complete or extend motion sequences, this task does not require an initial pose or sequence. Here we learn an action-aware latent representation for human motions by training a generative variational autoencoder (VAE). By sampling from this latent space and querying a certain duration through a series of positional encodings, we synthesize variable-length motion sequences conditioned on a categorical action. Specifically, we design a Transformer-based architecture, ACTOR, for encoding and decoding a sequence of parametric SMPL human body models estimated from action recognition datasets. We evaluate our approach on the NTU RGB+D, HumanAct12 and UESTC datasets and show improvements over the state of the art. Furthermore, we present two use cases: improving action recognition by adding our synthesized data to the training set, and motion denoising. Our code and models will be made available.
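The abstract describes querying a duration through a series of positional encodings: a single latent vector sampled from the action-conditioned space is expanded into T time-step queries for the Transformer decoder. The sketch below illustrates one plausible reading of that mechanism in NumPy; the function names and the exact way the latent is combined with the encodings (simple addition here) are assumptions for illustration, not the paper's actual decoder, which uses cross-attention within a full Transformer.

```python
import numpy as np

def positional_encoding(T, d):
    # Standard sinusoidal positional encodings: one d-dimensional
    # vector per time step, as in the original Transformer.
    pos = np.arange(T)[:, None]
    i = np.arange(d)[None, :]
    angles = pos / np.power(10000.0, (2 * (i // 2)) / d)
    pe = np.zeros((T, d))
    pe[:, 0::2] = np.sin(angles[:, 0::2])
    pe[:, 1::2] = np.cos(angles[:, 1::2])
    return pe

def decoder_queries(z, T):
    # Hypothetical helper: broadcast one sampled latent z (shape (d,))
    # over T time steps and add positional encodings, yielding the
    # per-frame queries a Transformer decoder could attend with.
    # Varying T alone changes the length of the generated motion.
    return z[None, :] + positional_encoding(T, z.shape[0])

rng = np.random.default_rng(0)
z = rng.standard_normal(8)          # latent sampled from the action-conditioned space
q_short = decoder_queries(z, 60)    # queries for a 60-frame motion
q_long = decoder_queries(z, 120)    # same latent, 120-frame motion
```

The point of the sketch is that the same latent sample can be decoded at any requested length, since only the number of positional-encoding queries changes.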
Mathis Petrovich received the MVA master's degree in mathematics in 2019 and a master's degree in computer science in 2020 from the École normale supérieure Paris-Saclay (ENS Paris-Saclay). He is currently a PhD student in a collaboration between the Imagine research group of LIGM at École des Ponts ParisTech (ENPC) and the Perceiving Systems department of the Max Planck Institute for Intelligent Systems (MPI). His PhD is co-supervised by Gül Varol (ENPC), Mathieu Aubry (ENPC) and Michael J. Black (MPI).
His interests span human body analysis, computer vision, machine learning and deep learning, and optimal transport. His current research focuses on action-conditioned human motion generation.