MotionClone: Training-Free Motion Cloning for Controllable Video Generation

Ling, Pengyang; Bu, Jiazi; Zhang, Pan; Dong, Xiaoyi; Zang, Yuhang; Wu, Tong; Chen, Huaian; Wang, Jiaqi; Jin, Yi

arXiv:2406.05338 (cs)

[Submitted on 8 Jun 2024 (v1), last revised 28 Jun 2024 (this version, v3)]

Abstract: Motion-based controllable text-to-video generation uses motion cues to control video generation. Previous methods typically require training models to encode motion cues or fine-tuning video diffusion models. However, these approaches often produce suboptimal motion when applied outside the trained domain. In this work, we propose MotionClone, a training-free framework that clones motion from a reference video to control text-to-video generation. We use temporal attention in video inversion to represent the motions in the reference video, and introduce primary temporal-attention guidance to mitigate the influence of noisy or very subtle motions within the attention weights. Furthermore, to help the generation model synthesize reasonable spatial relationships and to enhance its prompt-following capability, we propose a location-aware semantic guidance mechanism that leverages the coarse location of the foreground from the reference video and the original classifier-free guidance features to guide video generation. Extensive experiments demonstrate that MotionClone handles both global camera motion and local object motion, with notable superiority in motion fidelity, textual alignment, and temporal consistency.
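
The idea of guiding generation with only the "primary" temporal-attention components can be illustrated with a small sketch. The abstract does not say how the primary components are selected or how the guidance term is computed, so everything below is an assumption made for illustration: a top-k mask over each attention row, a masked squared-error loss, and the hypothetical names `top_k_mask` and `primary_attention_loss`. It is not the authors' implementation.

```python
# Illustrative sketch only. Top-k selection, the squared-error loss, and the
# row layout of the attention weights are assumptions, not from the abstract.

def top_k_mask(row, k):
    """Return a 0/1 mask marking the k largest entries of one attention row."""
    keep = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
    mask = [0.0] * len(row)
    for i in keep:
        mask[i] = 1.0
    return mask


def primary_attention_loss(ref_attn, gen_attn, k=1):
    """Masked squared error between reference and generated attention.

    ref_attn, gen_attn: lists of rows; each row holds the temporal-attention
    weights of one frame over all frames (hypothetical layout). Only the k
    largest reference entries per row contribute, so small (noisy or subtle)
    weights are ignored.
    """
    total = 0.0
    for ref_row, gen_row in zip(ref_attn, gen_attn):
        mask = top_k_mask(ref_row, k)
        total += sum(m * (r - g) ** 2 for m, r, g in zip(mask, ref_row, gen_row))
    return total


# Toy 3-frame example with made-up attention weights.
ref = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]]
gen = [[0.5, 0.3, 0.2], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4]]
loss = primary_attention_loss(ref, gen, k=1)
```

In a training-free setting, the gradient of such a loss with respect to the latent would typically be used to steer each denoising step. That step is omitted here because it depends on the specific diffusion model.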

Comments:	17 pages, 12 figures
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2406.05338 [cs.CV]
	(or arXiv:2406.05338v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2406.05338

Submission history

From: Pengyang Ling
[v1] Sat, 8 Jun 2024 03:44:25 UTC (25,550 KB)
[v2] Wed, 12 Jun 2024 12:57:34 UTC (25,550 KB)
[v3] Fri, 28 Jun 2024 18:08:19 UTC (29,843 KB)
