AvatarMe: Realistically Renderable 3D Facial Reconstruction "In-The-Wild"
Figure 1: From left to right: Input image; Predicted reflectance (diffuse albedo, diffuse normals, specular albedo and specular
normals); Rendered reconstruction in different environments, with detailed reflections; Rendered result with head completion.
Abstract

Over the last years, with the advent of Generative Adversarial Networks (GANs), many face analysis tasks have accomplished astounding performance, with applications including, but not limited to, face generation and 3D face reconstruction from a single "in-the-wild" image. Nevertheless, to the best of our knowledge, there is no method which can produce high-resolution photorealistic 3D faces from "in-the-wild" images, and this can be attributed to the: (a) scarcity of available data for training, and (b) lack of robust methodologies that can successfully be applied on very high-resolution data. In this paper, we introduce AvatarMe, the first method that is able to reconstruct photorealistic 3D faces from a single "in-the-wild" image with an increasing level of detail. To achieve this, we capture a large dataset of facial shape and reflectance and build on a state-of-the-art 3D texture and shape reconstruction method and successively refine its results, while generating the per-pixel diffuse and specular components that are required for realistic rendering. As we demonstrate in a series of qualitative and quantitative experiments, AvatarMe outperforms the existing arts by a significant margin and reconstructs authentic, 4K by 6K-resolution 3D faces from a single low-resolution image that, for the first time, bridges the uncanny valley.

1. Introduction

The reconstruction of a 3D face geometry and texture is one of the most popular and well-studied fields in the intersection of computer vision, graphics and machine learning. Apart from its countless applications, it demonstrates the power of recent developments in scanning, learning and synthesizing 3D objects [3, 44]. Recently, mainly due to the advent of deep learning, tremendous progress has been made in the reconstruction of a smooth 3D face geometry, even from images captured in arbitrary recording conditions (also referred to as "in-the-wild") [13, 14, 33, 36, 37]. Nevertheless, even though the geometry can be inferred somewhat accurately, in order to render a reconstructed face in arbitrary virtual environments, much more information than a 3D smooth geometry is required, i.e., skin reflectance as well as high-frequency normals. In this paper, we propose a meticulously designed pipeline for the reconstruction of high-resolution render-ready faces from "in-the-wild" images captured in arbitrary poses, lighting conditions and occlusions. A result from our pipeline is showcased in Fig. 1.

The seminal work in the field is the 3D Morphable Model (3DMM) fitting algorithm [3]. The facial texture and shape that is reconstructed by the 3DMM algorithm always lies in a space that is spanned by a linear basis which is learned by Principal Component Analysis (PCA).
The linear basis, even though remarkable in representing the basic characteristics of the reconstructed face, fails in reconstructing high-frequency details in texture and geometry. Furthermore, the PCA model fails in representing the complex structure of facial texture captured "in-the-wild". Therefore, 3DMM fitting usually fails on "in-the-wild" images. Recently, 3DMM fitting has been extended so that it uses a PCA model on robust features, i.e., Histograms of Oriented Gradients (HOGs) [8], for representing facial texture [4]. The method has shown remarkable results in reconstructing the 3D facial geometry from "in-the-wild" images. Nevertheless, it cannot reconstruct the facial texture as accurately.

With the advent of deep learning, many regression methods using an encoder-decoder structure have been proposed to infer 3D geometry, reflectance and illumination [6, 14, 33, 35, 36, 37, 39, 44]. Some of the methods demonstrate that it is possible to reconstruct shape and texture, even in real-time on a CPU [44]. Nevertheless, due to various factors, such as the use of basic reflectance models (e.g., the Lambertian reflectance model), the use of synthetic data or mesh-convolutions on colored meshes, the methods [33, 35, 36, 37, 39, 44] fail to reconstruct highly-detailed texture and shape that is render-ready. Furthermore, in many of the above methods, the reconstructed texture and shape lose many of the identity characteristics of the original image.

Arguably, the first generic method that demonstrated that it is possible to reconstruct high-quality texture and shape from single "in-the-wild" images is the recently proposed GANFIT method [14]. GANFIT can be described as an extension of the original 3DMM fitting strategy, but with the following differences: (a) instead of a PCA texture model, it uses a Generative Adversarial Network (GAN) [23] trained on large-scale high-resolution UV maps, and (b) in order to preserve the identity in the reconstructed texture and shape, it uses features from a state-of-the-art face recognition network [11]. However, the reconstructed texture and shape are not render-ready due to (a) the texture containing baked illumination, and (b) not being able to reconstruct high-frequency normals or specular reflectance.

Early attempts to infer photorealistic render-ready information from single "in-the-wild" images have been made in the line of research of [6, 20, 32, 42]. Arguably, some of the results showcased in the above noted papers are of high quality. Nevertheless, the methods do not generalize since: (a) they directly manipulate and augment the low-quality and potentially occluded input facial texture, instead of reconstructing it, and as a result, the quality of the final reconstruction always depends on the input image, (b) the employed 3D model is not very representative, and (c) a very small number of subjects (e.g., 25 [42]) were available for training of the high-frequency details of the face. Thus, while closest to our work, these approaches focus on easily creating a digital avatar rather than on high-quality render-ready face reconstruction from "in-the-wild" images, which is the goal of our work.

In this paper, we propose the first, to the best of our knowledge, methodology that produces high-quality render-ready face reconstructions from arbitrary images. In particular, our method builds upon recent reconstruction methods (e.g., GANFIT [14]) and, contrary to [6, 42], does not apply algorithms for high-frequency estimation to the original input, which could be of very low quality, but to a GAN-generated high-quality texture. Using a light stage, we have collected a large-scale dataset with samples of over 200 subjects' reflectance and geometry, and we train image translation networks that can perform estimation of (a) diffuse and specular albedo, and (b) diffuse and specular normals. We demonstrate that it is possible to produce render-ready faces from arbitrary face images (pose, occlusion, etc.), including portraits and face sketches, which can be realistically relighted in any environment.

2. Related Work

2.1. Facial Geometry and Reflectance Capture

Debevec et al. [9] first proposed employing a specialized light stage setup to acquire the reflectance field of a human face for photo-realistic image-based relighting applications. They also employed the acquired data to estimate a few view-dependent reflectance maps for rendering. Weyrich et al. [41] employed an LED sphere and 16 cameras to densely record facial reflectance and computed view-independent estimates of facial reflectance from the acquired data, including per-pixel diffuse and specular albedos and per-region specular roughness parameters. These initial works employed dense capture of facial reflectance, which is somewhat cumbersome and impractical.

Ma et al. [27] introduced polarized spherical gradient illumination (using an LED sphere) for efficient acquisition of separated diffuse and specular albedos and photometric normals of a face using just eight photographs, and demonstrated high quality facial geometry, including skin mesostructure, as well as realistic rendering with the acquired data. It was, however, restricted to a frontal viewpoint of acquisition due to the employment of a view-dependent polarization pattern on the LED sphere. Subsequently, Ghosh et al. [15] extended polarized spherical gradient illumination to multi-view facial acquisition by employing two orthogonal spherical polarization patterns. Their method allows the capture of separated diffuse and specular reflectance and photometric normals from any viewpoint around the equator of the LED sphere and can be considered the state-of-the-art in terms of high quality facial capture.
Figure 2: Overview of the proposed method. A 3DMM is fitted to an “in-the-wild” input image and a completed UV texture
is synthesized, while optimizing for the identity match between the rendering and the input. The texture is up-sampled 8
times, to synthesize plausible high-frequency details. We then use an image translation network to de-light the texture and
obtain the diffuse albedo with high-frequency details. Then, separate networks infer the specular albedo, diffuse normals
and specular normals (in tangent space) from the diffuse albedo and the 3DMM shape normals. The networks are trained on
512 × 512 patches and inference is run on 1536 × 1536 patches with a sliding window. Finally, we transfer the facial shape
and consistently inferred reflectance to a head model. Both face and head can be rendered realistically in any environment.
Recently, Kampouris et al. [22] demonstrated how to employ unpolarized binary spherical gradient illumination for estimating separated diffuse and specular albedo and photometric normals using color-space analysis. The method has the advantage of not requiring polarization and hence requires half the number of photographs compared to polarized spherical gradients, and it enables completely view-independent reflectance separation, making it faster and more robust for high quality facial capture [24].

Passive multiview facial capture has also made significant progress in recent years, from high quality facial geometry capture [2] to even detailed facial appearance estimation [17]. However, the quality of the data acquired with such passive capture methods is somewhat lower compared to active illumination techniques.

In this work, we employ two state-of-the-art active-illumination-based multiview facial capture methods [15, 24] for acquiring high quality facial reflectance data in order to build our training data.

2.2. Image-to-Image Translation

Image-to-image translation refers to the task of translating an input image to a designated target domain (e.g., turning sketches into images, or day into night scenes). With the introduction of GANs [16], image-to-image translation improved dramatically [21, 45]. Recently, with the increasing capabilities of the hardware, image-to-image translation has also been successfully attempted on high-resolution data [40]. In this work we utilize variations of pix2pixHD [40] to carry out tasks such as de-lighting and the extraction of reflectance maps at very high resolution.

2.3. Facial Geometry Estimation

Over the years, numerous methods have been introduced in the literature that tackle the problem of 3D facial reconstruction from a single input image. Early methods required a statistical 3DMM for both shape and appearance, usually encoded in a low-dimensional space constructed by PCA [3, 4]. Lately, many approaches have tried to leverage the power of Convolutional Neural Networks (CNNs) to either regress the latent parameters of a PCA model [38, 7] or utilize a 3DMM to synthesize images and formulate an image-to-image translation problem using CNNs [18, 31].

2.4. Photorealistic 3D faces with Deep Learning

Many approaches have been successful in acquiring the reflectance of materials from a single image, using deep networks with an encoder-decoder architecture [12, 25, 26]. However, they only explore 2D surfaces, and in a constrained environment, usually assuming a single point-light source.

Early applications on human faces [34, 35] used image translation networks to infer facial reflectance from an "in-the-wild" image, producing low-resolution results. Recent approaches attempt to incorporate additional facial normal and displacement maps, resulting in representations with high-frequency details [6]. Although this method demonstrates impressive results in geometry inference, it tends to fail in conditions with harsh illumination and extreme head poses, and it does not produce re-lightable results.
Saito et al. [32] proposed a deep learning approach for data-driven inference of a high resolution facial texture map of an entire face for realistic rendering, using as input a single low-resolution face image with partial facial coverage. This has been extended to the inference of facial mesostructure, given a diffuse albedo [20], and even of complete facial reflectance and displacement maps besides the albedo texture, given a partial facial image as input [42]. While closest to our work, these approaches achieve the creation of digital avatars, rather than high quality facial appearance estimation from "in-the-wild" images. In this work, we try to overcome these limitations by employing an iterative optimization framework as proposed in [14]. This optimization strategy leverages a deep face recognition network and GANs within a conventional fitting method in order to estimate high-quality geometry and texture with fine identity characteristics, which can then be used to produce high-quality reflectance maps.

3. Training Data

3.1. Ground Truth Acquisition

Figure 3: Two subjects' reflectance acquired with [15] (top) and [22, 24] (bottom): (a) diffuse albedo, (b) specular albedo, (c) diffuse normals, (d) specular normals. Specular normals are in tangent space.

We employ the state-of-the-art method of [15] for capturing high resolution pore-level reflectance maps of faces, using a polarized LED sphere with 168 lights (partitioned into two polarization banks) and 9 DSLR cameras. Half the LEDs on the sphere are vertically polarized (for parallel polarization), and the other half are horizontally polarized (for cross-polarization), in an interleaved pattern.

Using the LED sphere, we can also employ the color-space analysis from unpolarised LEDs [22] for diffuse-specular separation and the multi-view facial capture method of [24] to acquire unwrapped textures of similar quality (Fig. 3). This method requires less than half of the data to be captured (hence reduced capture time) and a simpler setup (no polarizers), enabling the acquisition of larger datasets.

3.2. Data Collection

In this work, we capture faces of over 200 individuals of different ages and characteristics under 7 different expressions. The geometry reconstructions are registered to a standard topology, like in [5], with unwrapped textures as shown in Fig. 3. We name the dataset RealFaceDB. It is currently the largest dataset of this type and we intend to make it publicly available to the scientific community¹.

¹ For the dataset and other materials we refer the reader to the project's page https://github.com/lattas/avatarme.

4. Method

Figure 4: Rendered patch ([14]-like) of a subject acquired with [15], with (a) the input rendering, and ground truth maps (top row) versus predictions of our network given the rendering as input (bottom row) for (b) diffuse albedo, (c) specular albedo, (d) diffuse normals and (e) specular normals.

To achieve photorealistic rendering of the human skin, we separately model the diffuse and specular albedo and normals of the desired geometry. Therefore, given a single unconstrained face image as input, we infer the facial geometry as well as the diffuse albedo (A_D), diffuse normals (N_D)², specular albedo (A_S) and specular normals (N_S).

² The diffuse normals N_D are not usually used in commercial rendering systems. By inferring N_D we can model the reflection as in the state-of-the-art specular-diffuse separation techniques [15, 24].

As seen in Fig. 2, we first reconstruct a 3D face (base geometry with texture) from a single image at a low resolution, using an existing 3DMM algorithm [5]. Then, the reconstructed texture map, which contains baked illumination, is enhanced by a super-resolution network, followed by a de-lighting network, to obtain a high resolution diffuse albedo A_D. Finally, we infer the other three components (A_S, N_D, N_S) from the diffuse albedo A_D in conjunction with the base geometry. The following sections explain these steps in detail.
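The stages above can be summarized in the following minimal sketch. Every function name and signature is an illustrative placeholder for the corresponding network (this is not a released API); only the way the stages of Fig. 2 compose is shown.

```python
# Minimal sketch of the inference flow in Fig. 2. Every callable is a
# placeholder supplied by the caller (e.g. a fitted GANFIT model, a
# super-resolution network, the translation networks); only the way the
# stages compose is illustrated here.
def reconstruct_render_ready_face(image, fit_3dmm, compute_normals,
                                  super_resolve, delight,
                                  specular_albedo_net, diffuse_normal_net,
                                  specular_normal_net, transfer_to_head=None):
    texture, shape = fit_3dmm(image)            # low-resolution texture T and shape S
    shape_normals = compute_normals(shape)      # shape normals in UV space

    texture_hr = super_resolve(texture)                   # 8x up-sampling
    diffuse_albedo = delight(texture_hr, shape_normals)   # A_D, illumination removed

    specular_albedo = specular_albedo_net(diffuse_albedo, shape_normals)   # A_S
    diffuse_normals = diffuse_normal_net(diffuse_albedo, shape_normals)    # N_D
    specular_normals = specular_normal_net(diffuse_albedo, shape_normals)  # N_S

    maps = {"A_D": diffuse_albedo, "A_S": specular_albedo,
            "N_D": diffuse_normals, "N_S": specular_normals}
    if transfer_to_head is not None:            # optional head completion
        return transfer_to_head(shape, maps)
    return shape, maps
```

Each placeholder corresponds to one of the networks described in the following sections; as noted in the Fig. 2 caption, inference on the full-resolution maps is performed on patches with a sliding window.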
4.1. Initial Geometry and Texture Estimation

Our method requires a low-resolution 3D reconstruction of a given face image I. Therefore, we begin with the estimation of the facial shape with n vertices, S ∈ R^{n×3}, and the texture, T ∈ R^{576×384×3}, by borrowing any state-of-the-art 3D face reconstruction approach (we use GANFIT [14]). Apart from the usage of deep identity features, GANFIT synthesizes realistic texture UV maps using a GAN as a statistical representation of the facial texture. We reconstruct the initial base shape and texture of the input image I as follows, and refer the reader to [14] for further details:

    T, S = G(I)    (1)

where G : R^{k×m×3} ↦ (R^{576×384×3}, R^{n×3}) denotes the GANFIT reconstruction method for an arbitrarily sized input image in R^{k×m×3}, and n is the number of vertices of the fixed topology.

Having acquired the prerequisites, we procedurally improve on them: from the reconstructed geometry S, we acquire the shape normals N and enhance the resolution of the facial texture T, before using them to estimate the components for physically based rendering, such as the diffuse and specular albedos and normals.

training of [14], while also having accurate ground truth of their albedo and normals. We compute a physically-based rendering for each subject from all view-points, using the predicted environment map and the predicted light sources with a random variation of their position, creating an illuminated texture map. We denote this whole simulation process by ξ : A_D ∈ R^{6144×4096×3} ↦ A_D^T ∈ R^{6144×4096×3}, which translates the diffuse albedo to the distribution of the textures with baked illumination, as shown in the following:

    A_D^T = ξ(A_D) ∼ E_{t ∈ {T_1, T_2, ..., T_n}}[t]    (3)
ular diffuse and normals. 4.3.2 Training the De-lighting Network
Figure 6: Reconstructions of our method re-illuminated under different environment maps [10] with added spot lights: (a) input, (b) Cathedral, (c) Sunset, (d) Tunnel.

normals and depth cannot be exploited to improve the quality of the generated diffuse and specular components. To alleviate the aforementioned shortcomings, we: (a) split the original high-resolution data into smaller patches of size 512 × 512. More specifically, using a stride of 256, we derive partially overlapping patches by passing through each original UV map horizontally as well as vertically; and (b) for each translation task, we utilize the shape normals and concatenate them channel-wise with the corresponding grayscale texture input (e.g., in the case of translating the diffuse albedo to the specular normals, we concatenate the grayscale diffuse albedo with the shape normals channel-wise), and thus feed a four-channel tensor ([G, X, Y, Z]) to the network. This increases the level of detail in the derived outputs, as the shape normals act as a geometric "guide". Note that during inference the patch size can be larger (e.g. 1536 × 1536), since the network is fully-convolutional.
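As an illustration of this patch preparation, the following sketch extracts partially overlapping 512 × 512 crops with a stride of 256 and concatenates the grayscale input with the shape normals channel-wise; the NumPy HWC layout and the function name are assumptions for illustration, not the training code itself.

```python
import numpy as np

# Sketch of the patch preparation described above: partially overlapping
# 512x512 crops taken with a stride of 256, each concatenating the grayscale
# input channel-wise with the three shape-normal channels ([G, X, Y, Z]).
def make_training_patches(gray_map, shape_normals, size=512, stride=256):
    height, width = gray_map.shape[:2]
    patches = []
    for y in range(0, height - size + 1, stride):
        for x in range(0, width - size + 1, stride):
            gray = gray_map[y:y + size, x:x + size, None]        # (size, size, 1)
            normals = shape_normals[y:y + size, x:x + size, :]   # (size, size, 3)
            patches.append(np.concatenate([gray, normals], axis=-1))
    return np.stack(patches)                                     # (N, size, size, 4)
```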
5.1.2 Training Setup

To train RCAN [43], we use the default hyper-parameters. For the rest of the translation models, we use a custom translation network as described earlier, which is based on pix2pixHD [40]. More specifically, we use 9 and 3 residual blocks in the global and local generators, respectively. The learning rate we employed is 0.0001, whereas the Adam betas are 0.5 for β1 and 0.999 for β2. Moreover, we do not use the VGG feature matching loss, as this slightly deteriorated the performance. Finally, we use as inputs 3- and 4-channel tensors, which include the shape normals N_O or depth D_O together with the RGB A_D or grayscale A_D^gray values.
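For reference, the settings quoted above can be gathered into a single configuration sketch; the key names below are illustrative and do not correspond to actual pix2pixHD command-line options.

```python
# The optimizer and generator settings quoted above, gathered for reference.
# Key names are illustrative only; they are not actual pix2pixHD options.
translation_training_config = {
    "global_generator_residual_blocks": 9,
    "local_generator_residual_blocks": 3,
    "learning_rate": 1e-4,
    "adam_betas": (0.5, 0.999),          # beta1, beta2
    "vgg_feature_matching_loss": False,  # slightly deteriorated performance
    "input_tensor_channels": (3, 4),     # RGB, or grayscale + normals/depth
}
# RCAN (the super-resolution network) is trained with its default hyper-parameters.
```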
5.2. Evaluation

We conduct quantitative as well as qualitative comparisons against the state-of-the-art. For the quantitative comparisons, we utilize the widely used PSNR metric [19] and report the results in Table 1. As can be seen, our method outperforms [6] and [42] by a significant margin. Moreover, using a state-of-the-art face recognition algorithm [11], we also find the highest match of facial identity to the input images when using our method. The input images were compared against renderings of the faces with reconstructed geometry and reflectance, including the eyes.
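For reference, the PSNR used for this comparison can be computed as follows; the value range assumed below ([0, max_value]) and the exact averaging protocol are illustrative assumptions.

```python
import numpy as np

# Standard PSNR between a predicted and a ground-truth map, with values
# assumed to lie in [0, max_value].
def psnr(prediction, target, max_value=1.0):
    mse = np.mean((np.asarray(prediction, dtype=np.float64) -
                   np.asarray(target, dtype=np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10((max_value ** 2) / mse)
```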
For the qualitative comparisons, we perform 3D reconstructions of "in-the-wild" images. As shown in Figs. 8 and 9, our method does not produce any artifacts in the final renderings and successfully handles extreme poses and occlusions such as sunglasses. We infer the texture maps in a patch-based manner from high-resolution input, which produces higher-quality details than [6, 42], who train on high-quality scans but infer the maps for the whole face, at a lower resolution. This is also apparent in Fig. 5, which shows our reconstruction after each step of our process. Moreover, we can successfully acquire each component from black-and-white images (Fig. 9) and even from drawn portraits (Fig. 8).

Furthermore, we experiment with different environment conditions, both in the input images and while rendering. As presented in Fig. 7, the extracted normals and the diffuse and specular albedos are consistent, regardless of the illumination in the original input images. Finally, Fig. 6 shows different subjects rendered under different environments. We can realistically illuminate each subject in each scene and accurately reconstruct the environment reflectance, including detailed specular reflections and subsurface scattering.

In addition to the facial mesh, we are able to infer the entire head topology based on the Universal Head Model (UHM) [29, 30]. We project our facial mesh to a subspace, regress the head latent parameters, and finally derive the completed head model with completed textures. Some of the qualitative head completion results can be seen in Figs. 1 and 2.
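One way to realize this projection-and-regression step is sketched below as a linear least-squares fit to a PCA-like head basis; the basis layout, variable names and the purely linear solve are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

# Hedged sketch of head completion: fit the latent parameters of a linear
# (PCA-like) head model, e.g. UHM, to the reconstructed facial region, then
# decode the full head from those latents.
def complete_head(face_vertices, face_index, head_mean, head_basis):
    """face_vertices: (F, 3); face_index: (F,) indices into the head mesh;
    head_mean: (V, 3); head_basis: (3 * V, K) linear basis."""
    num_components = head_basis.shape[1]
    # Restrict the basis and mean to the facial region covered by the fit.
    basis_face = head_basis.reshape(-1, 3, num_components)[face_index]   # (F, 3, K)
    target = (face_vertices - head_mean[face_index]).reshape(-1)         # (3F,)
    latents, *_ = np.linalg.lstsq(basis_face.reshape(-1, num_components),
                                  target, rcond=None)
    head_vertices = head_mean + (head_basis @ latents).reshape(-1, 3)    # (V, 3)
    return head_vertices, latents
```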
Figure 8: Comparison of reflectance maps predicted by our method against state-of-the-art methods: (a) input, (b) texture [6], (c) normals [6], (d) albedo [42], (e) specular albedo [42], (f) our diffuse albedo, (g) our specular albedo, (h) our specular normals. The [42] reconstruction is provided by the authors and the [6] results are obtained from their open-sourced models. The last column is cropped to better show the details.