MagicMirror: Fast and High-Quality Avatar Generation with a Constrained Search Space

Armand Comas1,2, Di Qiu1, Menglei Chai1, Marcel Buehler1,3, Amit Raj1, Ruiqi Gao4, Qiangeng Xu1, Mark Matthews1, Paulo Gotardo1, Sergio Orts-Escolano1, Thabo Beeler1
1Google, 2Northeastern University, 3ETH Zurich, 4Google DeepMind

Abstract

We introduce a novel framework for 3D human avatar generation and personalization, leveraging text prompts to enhance user engagement and customization. Central to our approach are key innovations aimed at overcoming the challenges in photo-realistic avatar synthesis. Firstly, we utilize a conditional Neural Radiance Fields (NeRF) model, trained on a large-scale unannotated multi-view dataset, to create a versatile initial solution space that accelerates and diversifies avatar generation. Secondly, we develop a geometric prior, leveraging the capabilities of Text-to-Image Diffusion Models, to ensure superior view invariance and enable direct optimization of avatar geometry. These foundational ideas are complemented by our optimization pipeline built on Variational Score Distillation (VSD), which mitigates texture loss and over-saturation issues. As supported by our extensive experiments, these strategies collectively enable the creation of custom avatars with unparalleled visual quality and better adherence to input text prompts.




Non-personalized Avatar Generation

MagicMirror can generate widely known real and fictional characters with high quality, realism, and alignment to the text prompt.

From left to right, we show the final rendering, geometric normals, depth, and transparency maps.

Barack Obama

Frida Kahlo

Michael Jackson

Leo Messi

Cristiano Ronaldo

Hillary Clinton

Morgan Freeman

Tom Hanks

Oprah Winfrey

Will Smith

Donald Trump

Joe Biden

Albert Einstein

Nikola Tesla

Daenerys Targaryen

Margot Robbie


Personalized Avatar Editing

MagicMirror can capture a subject's identity from either a reconstructed avatar or a selection of in-the-wild photos. Editing is carried out as a generation process, recontextualizing the captured identity through text prompts.

Subject 1

Original Avatar

Expressions: Sad

Expressions: Happy

Expressions: Angry

Expressions: Shocked

Accessories: Headphones

Stylization: Old Person

Stylization: Joker

Stylization: Marble Statue

Subject 2

Original Avatar

Expressions: Sad

Expressions: Happy

Accessories: Headphones

Accessories: Mustache

Stylization: The Joker

Stylization: Marble Statue

Subject 3

Original Avatar

Expressions: Sad

Expressions: Angry

Accessories: Headphones

Accessories: Mustache

Stylization: The Joker

Stylization: Old person

AvatarStudio Subject 1

Original Identity (30 Views, can have different expressions)

Expressions: Sad

Expressions: Happy

Expressions: Angry

Stylization: the Grinch

AvatarStudio Subject 2

Original Identity (30 Views, can have different expressions)

Stylization: Blue Hair

Stylization: Old person

Stylization: Vincent Van Gogh

Stylization: the Grinch

AvatarStudio Subject 3

Original Identity (30 Views, can have different expressions)

Stylization: Zombie

Stylization: Old person

Stylization: Bronze Statue

Stylization: the Joker


More results

Optimization trajectory: from sad to happy

Alternative Joker

Joker without Green

Alternative Grinch

Alternative Van Gogh Look Alike

Subject 1 Wear Glasses

Subject 1 Child Version

Subject 3 Child Version

Subject 3 Marble Statue

Cruella Costume

Alternative Hillary