CapHuman: Capture Your Moments in Parallel Universes

Liang, Chao; Ma, Fan; Zhu, Linchao; Deng, Yingying; Yang, Yi

arXiv:2402.00627 (cs)

[Submitted on 1 Feb 2024 (v1), last revised 17 May 2024 (this version, v3)]

Abstract: We address a novel human-centric image synthesis task: given only one reference facial photograph, generate images of that specific individual with diverse head positions, poses, facial expressions, and illuminations in different contexts. To accomplish this goal, we argue that the generative model should have the following characteristics: (1) a strong visual and semantic understanding of our world and human society for basic object and human image generation; (2) generalizable identity preservation; and (3) flexible and fine-grained head control. Recently, large pre-trained text-to-image diffusion models have shown remarkable results, serving as a powerful generative foundation. Building on this foundation, we aim to unleash the latter two capabilities in the pre-trained model. In this work, we present a new framework named CapHuman. We embrace the "encode then learn to align" paradigm, which enables generalizable identity preservation for new individuals without cumbersome tuning at inference: CapHuman encodes identity features and then learns to align them into the latent space. Moreover, we introduce a 3D facial prior to equip our model with control over the human head in a flexible and 3D-consistent manner. Extensive qualitative and quantitative analyses demonstrate that CapHuman can produce well-identity-preserved, photo-realistic, and high-fidelity portraits with content-rich representations and various head renditions, superior to established baselines. Code and checkpoint will be released at this https URL.

Comments:	Accepted by CVPR 2024. Project page: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2402.00627 [cs.CV]
	(or arXiv:2402.00627v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2402.00627

Submission history

From: Chao Liang
[v1] Thu, 1 Feb 2024 14:41:59 UTC (32,035 KB)
[v2] Mon, 19 Feb 2024 11:33:47 UTC (32,852 KB)
[v3] Fri, 17 May 2024 14:40:55 UTC (36,786 KB)
