FaceFolds

Meshed Radiance Manifolds for Efficient Volumetric Rendering of Dynamic Faces

Abstract

3D rendering of dynamic face captures is a challenging problem that demands improvements on several fronts: photorealism, efficiency, compatibility, and configurability. We present a novel representation that enables high-quality volumetric rendering of an actor's dynamic facial performances with a minimal compute and memory footprint. It runs natively on commodity graphics software and hardware, and allows for a graceful trade-off between quality and efficiency. Our method builds on recent advances in neural rendering, in particular learning discrete radiance manifolds that sparsely sample the scene to model volumetric effects. We achieve efficient modeling by learning a single set of manifolds for the entire dynamic sequence, while implicitly modeling appearance changes as a temporal canonical texture. We export a single layered mesh and a view-independent RGBA texture video that are compatible with legacy graphics renderers without additional ML integration. We demonstrate our method by rendering dynamic face captures of real actors in a game engine, with photorealism comparable to state-of-the-art neural rendering techniques and at previously unseen frame rates.



Methodology

We learn a 3D representation of a dynamic face sequence with a set of radiance manifolds, which are exported as a static layered mesh and an RGBA texture video. These assets can be rendered efficiently on any legacy renderer.
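To make the layered RGBA idea concrete, here is a minimal sketch of how one camera ray could be shaded once it has hit each mesh layer and fetched an RGBA texel from it. It assumes standard front-to-back "over" alpha compositing with straight (non-premultiplied) colors. The page does not spell out the blending equation, so treat this as an illustration rather than the authors' implementation. The function name `composite_front_to_back` is hypothetical.

```python
from typing import Sequence, Tuple

RGBA = Tuple[float, float, float, float]  # straight color + alpha, all in [0, 1]
RGB = Tuple[float, float, float]


def composite_front_to_back(samples: Sequence[RGBA],
                            background: RGB = (0.0, 0.0, 0.0)) -> RGB:
    """Blend RGBA samples ordered nearest-first along a single camera ray.

    Assumption: ordinary "over" compositing, which is what fixed-function
    alpha blending in a conventional renderer computes for sorted layers.
    """
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light not yet absorbed by nearer layers
    for r, g, b, a in samples:
        weight = transmittance * a
        color[0] += weight * r
        color[1] += weight * g
        color[2] += weight * b
        transmittance *= 1.0 - a
    # Whatever transmittance remains lets the background show through.
    return tuple(c + transmittance * bg for c, bg in zip(color, background))


# A half-transparent red layer in front of an opaque green layer.
pixel = composite_front_to_back([(1.0, 0.0, 0.0, 0.5), (0.0, 1.0, 0.0, 1.0)])
```

In a game engine the same result would normally come from drawing the layers in depth order with hardware alpha blending, which is why no ML runtime is needed at render time.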



View Synthesis Results

Our method achieves state-of-the-art visual quality while facilitating very efficient rendering of dynamic sequences on traditional graphics software without any custom integration of machine learning pipelines.



Free-viewpoint Rendering on Unity

Our representation allows for >60 fps free-viewpoint volumetric rendering at 1.5K resolution on consumer hardware.



Frame Interpolation

We can export video textures with higher frame rates by interpolating between learned latent codes of consecutive frames.

Video comparison (two examples): original vs. 3X frame rate
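The interpolation step can be sketched as follows. This assumes each frame has one latent code stored as a flat list of floats and that in-between codes are obtained by linear interpolation. The page only says that codes of consecutive frames are interpolated, so the linear scheme and the names `lerp` and `upsample_latents` are assumptions for illustration. Decoding each code into a texture frame is done by the learned model and is not shown.

```python
from typing import List, Sequence


def lerp(a: Sequence[float], b: Sequence[float], t: float) -> List[float]:
    """Linearly interpolate between two latent codes, t in [0, 1]."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def upsample_latents(codes: Sequence[Sequence[float]], factor: int) -> List[List[float]]:
    """Insert factor - 1 interpolated codes between each consecutive pair.

    With factor = 3, every original interval is split into three steps,
    which corresponds to a 3X frame rate.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    if not codes:
        return []
    out: List[List[float]] = []
    for a, b in zip(codes, codes[1:]):
        for k in range(factor):
            out.append(lerp(a, b, k / factor))  # k = 0 reproduces the original frame
    out.append(list(codes[-1]))  # keep the final original frame
    return out


# Two 2-D latent codes upsampled 3X yield four codes in total.
dense = upsample_latents([[0.0, 3.0], [3.0, 0.0]], factor=3)
```

For a sequence of N original frames this produces factor * (N - 1) + 1 codes, each of which would then be decoded into one frame of the exported texture video.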



Ablation Studies

We can trade off image quality against memory efficiency via mesh simplification and texture downsampling. Our manifolds are mostly smooth, so the exported meshes can be decimated to as few as 3K triangles without a notable loss of visual quality. Video textures can be subsampled to a specific resolution to meet the demands of the target application.

Mesh Resolution: 16 x 16, 64 x 64, 512 x 512

Texture Resolution: 128 x 128, 256 x 256, 512 x 512
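As an illustration of texture downsampling, the sketch below reduces one RGBA texture frame by an integer factor using a plain box filter. The page does not say which filter is used for its exports, so the box filter, the nested-list texture layout, and the name `box_downsample` are all assumptions.

```python
from typing import List, Sequence, Tuple

RGBA = Tuple[float, float, float, float]


def box_downsample(texture: Sequence[Sequence[RGBA]], factor: int) -> List[List[RGBA]]:
    """Average every factor x factor block of texels into one output texel."""
    if not texture or not texture[0]:
        raise ValueError("texture must be non-empty")
    height, width = len(texture), len(texture[0])
    if factor < 1 or height % factor or width % factor:
        raise ValueError("factor must be positive and divide both dimensions")
    area = factor * factor
    result: List[List[RGBA]] = []
    for top in range(0, height, factor):
        row: List[RGBA] = []
        for left in range(0, width, factor):
            sums = [0.0, 0.0, 0.0, 0.0]
            for y in range(top, top + factor):
                for x in range(left, left + factor):
                    for c in range(4):
                        sums[c] += texture[y][x][c]
            row.append(tuple(s / area for s in sums))
        result.append(row)
    return result


# A 2 x 2 checker of transparent-black and opaque-white texels becomes one grey texel.
clear, solid = (0.0, 0.0, 0.0, 0.0), (1.0, 1.0, 1.0, 1.0)
small = box_downsample([[clear, solid], [solid, clear]], factor=2)
```

Halving each dimension, for example from 512 x 512 to 256 x 256, cuts the texel count per frame by a factor of four. Averaging in premultiplied-alpha space is a common refinement that keeps fully transparent texels from tinting their neighbors. It is omitted here for brevity.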

Number of Manifolds

Using a sufficient number of manifolds is essential to attain photorealism and volumetric effects.

Comparison (two examples): N = 1, N = 4, N = 12



BibTeX Citation

@article{medin2024facefolds,
  author    = {Medin, Safa C. and Li, Gengyan and Du, Ruofei and Garbin, Stephan and Davidson, Philip and Wornell, Gregory W. and Beeler, Thabo and Meka, Abhimitra},
  title     = {FaceFolds: Meshed Radiance Manifolds for Efficient Volumetric Rendering of Dynamic Faces},
  journal   = {Proceedings of the ACM on Computer Graphics and Interactive Techniques},
  year      = {2024},
}