We show results on expressions unseen during training. In these examples, we extract 3DMM expression coefficients from the Expression Target (left) and apply them to our target subject on the right. Note how both MVP and RGCA often fail to properly reconstruct the expressions.
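The cross-identity transfer above follows the standard linear 3DMM formulation: a mesh is the mean shape plus identity and expression offsets, so driving our subject with another person's expression amounts to keeping the target's identity coefficients and swapping in the source's expression coefficients. A minimal sketch with NumPy, using hypothetical basis shapes and random placeholder coefficients (real models such as FLAME or BFM differ in dimensions and fitting procedure):

```python
import numpy as np

# Hypothetical 3DMM dimensions; placeholders, not the actual model used.
N_VERTS, N_ID, N_EXPR = 5023, 100, 50
rng = np.random.default_rng(0)

mean_shape = rng.standard_normal((N_VERTS, 3))
id_basis = rng.standard_normal((N_VERTS, 3, N_ID))
expr_basis = rng.standard_normal((N_VERTS, 3, N_EXPR))

def reconstruct(id_coeffs, expr_coeffs):
    """Linear 3DMM: mean shape + identity offsets + expression offsets."""
    return mean_shape + id_basis @ id_coeffs + expr_basis @ expr_coeffs

# In practice these would be fitted to images; random stand-ins here.
source_expr = rng.standard_normal(N_EXPR)  # from the Expression Target
target_id = rng.standard_normal(N_ID)      # identity of our subject

# Cross-identity reenactment: target identity driven by source expression.
driven_mesh = reconstruct(target_id, source_expr)
print(driven_mesh.shape)  # (5023, 3)
```

With zero coefficients the reconstruction reduces to the mean shape, which makes the linearity of the transfer easy to verify.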
Furthermore, although overall quality is comparable, our method reconstructs certain finer details better than RGCA. Both our method and RGCA significantly outperform GaussianAvatars, MVP, and Gaussian Head Avatars in terms of quality.
Note that since the Multiface dataset does not contain eyeball meshes, no method is able to reconstruct gaze correctly.
To show that our method does not improve purely through the use of additional gaussians, we additionally provide comparisons in which our method uses a roughly equal number of gaussians. Specifically, RGCA was trained with 1M gaussians, and GaussianAvatars typically densifies up to ~200K gaussians. Note how, despite some additional artifacts and blurrier hair, our method retains quality even with only 200K gaussians.