ShellNeRF: Learning a Controllable High-resolution Model of the Eye and Periocular Region

ETH Zürich, Google

We present ShellNeRF - a novel method for high-resolution novel view synthesis and animation of the periocular face region. Our method allows for controlling expressions and eye gaze and renders novel views at an unprecedented level of detail.

Abstract

Eye gaze and expressions are crucial non-verbal signals in face-to-face communication. Visual effects and telepresence demand significant improvements in personalized tracking, animation, and synthesis of the eye region to achieve true immersion. Morphable face models, in combination with coordinate-based neural volumetric representations, show promise in solving the difficult problem of reconstructing intricate geometry (eyelashes) and synthesizing photorealistic appearance variations (wrinkles and specularities) of eye performances. We propose a novel hybrid representation - ShellNeRF - that builds a discretized volume around a 3DMM face mesh using concentric surfaces to model the deformable ‘periocular’ region. We define a canonical space using the UV layout of the shells that constrains the space of dense correspondence search. Combined with an explicit eyeball mesh for modeling corneal light-transport, our model allows for animatable photorealistic 3D synthesis of the whole eye region. Using multi-view video input, we demonstrate significant improvements over state-of-the-art in expression re-enactment and transfer for high-resolution close-up views of the eye region.
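
To make the shell construction concrete, here is a minimal sketch of how concentric shells can be built by offsetting a 3DMM mesh along its vertex normals, so that every shell inherits the base mesh's UV layout and a point on shell k gets canonical coordinates (u, v, k). This is an illustration, not the paper's code; the shell count, offset range, and use of area-weighted normals are our assumptions.

import numpy as np

def vertex_normals(verts, faces):
    # Area-weighted per-vertex normals of a triangle mesh.
    normals = np.zeros_like(verts)
    tris = verts[faces]                                       # (F, 3, 3)
    face_n = np.cross(tris[:, 1] - tris[:, 0], tris[:, 2] - tris[:, 0])
    for i in range(3):                                        # accumulate onto corners
        np.add.at(normals, faces[:, i], face_n)
    return normals / np.linalg.norm(normals, axis=-1, keepdims=True)

def build_shells(verts, faces, n_shells=16, max_offset=0.02):
    # Concentric surfaces around the base 3DMM mesh. Every shell shares
    # the topology, and hence the UV layout, of the base mesh, so a
    # sample on shell k has canonical coordinates (u, v, k).
    normals = vertex_normals(verts, faces)
    offsets = np.linspace(-max_offset, max_offset, n_shells)  # inward and outward
    return np.stack([verts + d * normals for d in offsets])   # (n_shells, V, 3)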

BibTeX

@article{10.1111:cgf.15041,
  journal = {Computer Graphics Forum},
  title = {{ShellNeRF: Learning a Controllable High-resolution Model of the Eye and Periocular Region}},
  author = {Li, Gengyan and Sarkar, Kripasindhu and Meka, Abhimitra and Buehler, Marcel and Mueller, Franziska and Gotardo, Paulo and Hilliges, Otmar and Beeler, Thabo},
  year = {2024},
  publisher = {The Eurographics Association and John Wiley & Sons Ltd.},
  ISSN = {1467-8659},
  DOI = {10.1111/cgf.15041}
}

Novel View Synthesis

We can render the same scene from continuous camera viewpoints. Unlike other methods, ours maintains multi-view consistency and does not "hide" wrinkles and shadows beneath the skin surface. Nerface suffers from high instability and often diverges, as seen in the second subject.

[Video comparison, two subjects: Ours | MVP | Nerface + EyeNeRF | 3DMM cond. EyeNeRF | MonoAvatar]
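
The continuous camera sweeps above can be produced by interpolating between two calibrated camera poses. Below is a minimal sketch assuming camera-to-world 4x4 matrices, using SciPy's SLERP for the rotations; the pose convention is our assumption, not the paper's.

import numpy as np
from scipy.spatial.transform import Rotation, Slerp

def camera_path(pose_a, pose_b, n_frames=60):
    # Interpolate two camera-to-world poses (4x4) into a smooth sweep:
    # rotations via SLERP on SO(3), translations linearly.
    rots = Rotation.from_matrix(np.stack([pose_a[:3, :3], pose_b[:3, :3]]))
    slerp = Slerp([0.0, 1.0], rots)
    ts = np.linspace(0.0, 1.0, n_frames)
    poses = np.tile(np.eye(4), (n_frames, 1, 1))
    poses[:, :3, :3] = slerp(ts).as_matrix()
    poses[:, :3, 3] = (1 - ts)[:, None] * pose_a[:3, 3] + ts[:, None] * pose_b[:3, 3]
    return poses  # render one frame per pose for a multi-view-consistent sweep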

Decomposition

We can also decompose each video into albedo, diffuse shading and specular shading.

[Video decomposition, two subjects: Albedo | Diffuse | Specular]
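
As an illustration of how such layers recombine, the sketch below assumes the common intrinsic-image convention in which view-independent albedo is modulated by diffuse shading and view-dependent specular shading is additive; the paper's exact blending may differ.

import numpy as np

def recombine(albedo, diffuse, specular):
    # Recompose a frame from its intrinsic layers (all HxWx3 in [0, 1]):
    # C = albedo * diffuse + specular, clamped to the displayable range.
    return np.clip(albedo * diffuse + specular, 0.0, 1.0)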

Regazing

Our method can control eye gaze and synthesize novel gaze directions. Note how our results are significantly sharper and move more smoothly than those of all competing methods. Mixture of Volumetric Primitives (MVP) requires a projected texture as input, which is not available for such manipulations; we therefore use a neutral texture as a placeholder. MVP is unable to regaze properly and instead blends from one eye pose to the next. EyeNeRF, on the other hand, lacks the overall quality needed to learn the eyeball correctly for one pose of the first subject, and fails more generally for the second subject.

[Video comparison, two subjects: Ours | MVP | Nerface + EyeNeRF | 3DMM cond. EyeNeRF | MonoAvatar]
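
Since ShellNeRF models the eyeball with an explicit mesh, regazing amounts to rigidly rotating that mesh about its center. The sketch below is a hypothetical illustration; the pitch/yaw parameterization and rotation order are our assumptions.

import numpy as np

def regaze(eyeball_verts, center, pitch, yaw):
    # Rigidly rotate an eyeball mesh about its center for a new gaze
    # direction. Angles in radians; the order (yaw about Y, then pitch
    # about X) is an assumption, as conventions vary between eye models.
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    rot_x = np.array([[1, 0, 0], [0, cp, -sp], [0, sp, cp]])
    rot_y = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    return (eyeball_verts - center) @ (rot_x @ rot_y).T + center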

Regazing with a Moving Camera

We can do the same with a moving camera.

[Video comparison, two subjects: Ours | MVP | Nerface + EyeNeRF | 3DMM cond. EyeNeRF | MonoAvatar]

Interpolating Expressions

We interpolate between 13 expressions. Related works struggle to render convincing expressions, whereas with our method the periocular region smoothly adapts, showing naturally deforming wrinkles and highly detailed reflections on the eyeball. Again, for MVP we use the neutral texture as a placeholder.

[Video comparison, two subjects: Ours | MVP | Nerface + EyeNeRF | 3DMM cond. EyeNeRF | MonoAvatar]
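
A timeline like the one above can be built by blending the 3DMM expression coefficients of consecutive keyframes. The sketch below uses smoothstep easing between the captured expressions; the easing choice is ours, not necessarily the paper's.

import numpy as np

def expression_timeline(keyframes, frames_per_segment=30):
    # Blend through a (K, D) array of 3DMM expression coefficient
    # vectors, e.g. the 13 captured expressions, one row per keyframe.
    segments = []
    for a, b in zip(keyframes[:-1], keyframes[1:]):
        t = np.linspace(0.0, 1.0, frames_per_segment, endpoint=False)
        t = t * t * (3.0 - 2.0 * t)                 # smoothstep easing
        segments.append((1 - t)[:, None] * a + t[:, None] * b)
    segments.append(keyframes[-1:])                 # hold the final expression
    return np.concatenate(segments)                 # feed each row to the model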

Interpolating Expressions with a Moving Camera

We can also interpolate expressions while moving the camera, maintaining 3D consistency throughout the motion. State-of-the-art methods struggle with gaze and expression changes and produce significant floaters at novel camera viewpoints.

[Video comparison, two subjects: Ours | MVP | Nerface + EyeNeRF | 3DMM cond. EyeNeRF | MonoAvatar]

3DMM Expressions

Our shell-based formulation enables fine-grained control via 3DMM parameters. We show slow-motion renderings of a highly complex expression: a closing eyelid. Only our method and MonoAvatar are capable of this.

[Video comparison, two subjects: Ours | MVP | Nerface + EyeNeRF | 3DMM cond. EyeNeRF | MonoAvatar]
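
For a linear 3DMM, such fine-grained control reduces to evaluating V = V_mean + sum_i w_i * B_i while ramping a single coefficient over time. The sketch below is hypothetical; which basis component drives the eyelid depends on the specific 3DMM.

import numpy as np

def eval_blendshapes(mean_verts, basis, weights):
    # Linear 3DMM expression model: V = V_mean + sum_i w_i * B_i,
    # with basis of shape (D, V, 3) and weights of shape (D,).
    return mean_verts + np.tensordot(weights, basis, axes=1)

def blink_sequence(mean_verts, basis, blink_idx, n_frames=120):
    # Slow-motion eyelid close: ramp one (hypothetical) blink coefficient
    # from 0 to 1; blink_idx depends on the 3DMM's expression basis.
    weights = np.zeros(basis.shape[0])
    for t in np.linspace(0.0, 1.0, n_frames):
        weights[blink_idx] = t
        yield eval_blendshapes(mean_verts, basis, weights)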

Reenactment

We show results on expressions unseen during training. In these examples, we extract 3DMM coefficients from the Expression Target (left) and apply them to our target subject on the right. Note how our method faithfully reproduces the desired expressions. Our 3DMM fitting does not enforce temporal smoothness, which causes jitter; because we crop each frame based on the estimated eye pose, this jitter also appears in the GT video. Although the strongest baseline performs comparably in most situations, it is unable to handle certain expressions it did not directly observe, such as the half-open eye in the first example, resulting in strong artifacts.

[Video comparison, two subjects: Expression Target | Ours | MVP | Nerface + EyeNeRF | 3DMM cond. EyeNeRF | MonoAvatar]
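
Conceptually, reenactment feeds the coefficients fitted to the source actor straight into the target subject's model. The sketch below is a hypothetical driver loop; render_fn stands in for the trained model, and the optional moving-average smoothing addresses the fitting jitter noted above.

import numpy as np

def reenact(source_coeffs, render_fn, smooth_window=1):
    # source_coeffs: (T, D) per-frame expression (and gaze) parameters
    # fitted to the expression target; render_fn maps one coefficient
    # vector to a frame of the target subject. smooth_window > 1 applies
    # a moving average, since the 3DMM fitting enforces no temporal
    # smoothness on its own.
    coeffs = np.asarray(source_coeffs, dtype=np.float64)
    if smooth_window > 1:
        kernel = np.ones(smooth_window) / smooth_window
        coeffs = np.stack([np.convolve(coeffs[:, j], kernel, mode='same')
                           for j in range(coeffs.shape[1])], axis=1)
    return [render_fn(c) for c in coeffs]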