I am a Lead Researcher and team manager at the Imaging Algorithm Center of VIVO. Our group is the core algorithm team responsible for advancing the photographic quality of the flagship smartphones with cutting-edge technologies (3D, AIGC, etc.).

I was a Senior Researcher at the Visual Computing Center of Tencent AI Lab from 2021 to 2023.

I was a Postdoctoral Researcher at Stanford University, supervised by Prof. Leonidas Guibas, from 2019 to 2021.

I obtained my PhD degree from the School of Computer Science and Technology at Shandong University in 2019, supervised by Prof. Baoquan Chen.

My research focuses on computational photography, 3D vision, and Embodied AI. I have published 40+ papers at top international conferences such as SIGGRAPH, CVPR, and ICCV, including 10+ oral papers.

👩‍🎓🧑‍🎓 Internship at VIVO. If you are interested in a research internship in computational photography, 3DV, or Embodied AI, feel free to drop me an email.

🔥 Tech Transfer

VIVO X200 series: High-fidelity generative diffusion models for telephoto image enhancement

VIVO X300 series: Text image enhancement (Chinese/English) with generative diffusion prior

📝 Selected Publications

Equal contribution is marked with *.

🧑‍🎨 Applied research for deployed projects

ICLR 2026

LiveMoments: Reselected Key Photo Restoration in Live Photos via Reference-guided Diffusion

Clara Xue, Zizheng Yan, Zhenning Shi, Yuhang Yu, Jingyu Zhuang, Qi Zhang, Jinwei Chen, Qingnan Fan.

The first to address the problem of reselected key photo restoration in Live Photos.
LiveMoments significantly improves perceptual quality and fidelity over existing solutions, including the recent flagships from vivo and iPhone.

ICCV 2025

BokehDiff: Neural Lens Blur with One-Step Diffusion

Chengxuan Zhu, Qingnan Fan, Qi Zhang, Jinwei Chen, Huaqi Zhang, Chao Xu, Boxin Shi.

arXiv / codes

The first neural lens blur rendering pipeline based on pretrained diffusion priors.
A diffusion framework with only one inference step that achieves outstanding quality compared with previous methods, especially in regions where depth prediction fails.

CVPR 2025

TSD-SR: One-Step Diffusion with Target Score Distillation for Real-World Image Super-Resolution

Linwei Dong*, Qingnan Fan*, Yihong Guo, Zhonghao Wang, Qi Zhang, Jinwei Chen, Yawei Luo, Changqing Zou.

arXiv / codes

The first image super-resolution work that leverages a pretrained diffusion transformer (DiT) prior, specifically Stable Diffusion 3.
TSD-SR achieves superior restoration results (best on most metrics) and the fastest inference speed (e.g., 40 times faster than SeeSR) among existing Real-ISR approaches based on pretrained diffusion priors.

NeurIPS 2025

Text-Aware Real-World Image Super-Resolution via Diffusion Model with Joint Segmentation Decoders

Qiming Hu, Linlong Fan, Yiyan Luo, Yuhang Yu, Xiaojie Guo, Qingnan Fan.

arXiv / codes

The first full-image text super-resolution work utilizing diffusion priors.
A novel diffusion-based SR framework, which integrates text-aware attention and joint segmentation decoders to recover not only natural details but also the structural fidelity of text regions in degraded real-world images.

MM 2025 (Oral)

AuthFace: Towards Authentic Blind Face Restoration with Face-oriented Generative Diffusion Prior

Guoqiang Liang, Qingnan Fan, Bingtao Fu, Jinwei Chen, Hong Gu, Lin Wang.

arXiv / codes

Pioneering a new approach by enhancing the generative capabilities of pretrained T2I models for authentic face restoration, moving beyond traditional model design.
A novel framework, AuthFace, that achieves highly authentic face restoration results by exploring a face-oriented generative diffusion prior.

📚 Oral/Awarded research

CVPR 2025 (Highlight)

SLAM3R: Real-Time Dense Scene Reconstruction from Monocular RGB Videos

Yuzheng Liu*, Siyan Dong*, Shuzhe Wang, Yingda Yin, Yanchao Yang, Qingnan Fan, Baoquan Chen.

arXiv / video / codes

Award: China3DV 2025, Top1 paper.
SLAM3R is a real-time dense scene reconstruction system that regresses 3D points from video frames using feed-forward neural networks, without explicitly estimating camera parameters.

SIGGRAPH Asia & TOG 2023

Scene-aware Activity Program Generation with Language Guidance

Zejia Su, Qingnan Fan, Xuelin Chen, Oliver van Kaick, Hui Huang, Ruizhen Hu.

project page / supp file / bibtex

We address the problem of scene-aware activity program generation, which requires decomposing a given activity task into instructions that can be sequentially performed within a target scene to complete the activity.

SIGGRAPH Asia 2023

C·ASE: Learning Conditional Adversarial Skill Embeddings for Physics-based Characters

Zhiyang Dou, Xuelin Chen, Qingnan Fan, Taku Komura, Wenping Wang.

arXiv / project page / video / bibtex

We present C·ASE, an efficient and effective framework that learns conditional Adversarial Skill Embeddings for Elite physics-based characters.

ICLR 2022

VAT-Mart: Learning Visual Action Trajectory Proposals for Manipulating 3D ARTiculated Objects

Ruihai Wu*, Yan Zhao*, Kaichun Mo*, Zizheng Guo, Yian Wang, Tianhao Wu, Qingnan Fan, Xuelin Chen, Leonidas Guibas, Hao Dong.

arXiv / project page / codes / video / bibtex

Award: WAIC 2025, Young Outstanding Paper Award
We design an interaction-for-perception framework, VAT-Mart, to learn actionable visual representations for more effective manipulation of 3D articulated objects.

CVPR 2022 (Oral)

ADeLA: Automatic Dense Labeling with Attention for Viewpoint Shift in Semantic Segmentation

Yanchao Yang*, Hanxiang Ren*, He Wang, Bokui Shen, Qingnan Fan, Youyi Zheng, C. Karen Liu, Leonidas Guibas.

arXiv / codes / bibtex

We describe a method to deal with performance drop in semantic segmentation caused by viewpoint changes within multi-camera systems, where temporally paired images are readily available, but the annotations may only be abundant for a few typical views.

ICCV 2021 (Oral)

CAPTRA: CAtegory-level Pose Tracking for Rigid and Articulated Objects from Point Clouds

Yijia Weng*, He Wang*, Qiang Zhou, Yuzhe Qin, Yueqi Duan, Qingnan Fan, Baoquan Chen, Hao Su, Leonidas Guibas.

arXiv / project page / codes / video / bibtex

For the first time, we propose a unified framework that can handle 9-DoF pose tracking for novel rigid object instances as well as per-part pose tracking for 3D articulated objects.

CVPR 2021 (Oral)

Robust Neural Routing Through Space Partitions for Camera Relocalization in Dynamic Indoor Environments

Siyan Dong*, Qingnan Fan*, He Wang, Ji Shi, Li Yi, Thomas Funkhouser, Baoquan Chen, Leonidas Guibas.

arXiv / codes / video / bibtex

A novel outlier-aware neural tree to tackle the camera localization challenges in dynamic indoor environments. It achieves the best performance on the RIO-10 benchmark.

SIGGRAPH Asia & TOG 2018

Image Smoothing via Unsupervised Learning

Qingnan Fan, Jiaolong Yang, David Wipf, Baoquan Chen, Xin Tong.

arXiv / codes / supp file / bibtex

Treat deep learning as an optimization tool to minimize the proposed image smoothing objective function in an unsupervised manner. Multiple different smoothing effects can be easily learned by adaptively changing the proposed objective function.

CVPR 2018 (Oral)

Revisiting Deep Intrinsic Image Decompositions

Qingnan Fan, Jiaolong Yang, Gang Hua, Baoquan Chen, David Wipf.

arXiv / codes / slides / supp file / poster / bibtex

The first demonstration of a single basic deep architecture capable of achieving state-of-the-art results when applied to each of the major intrinsic benchmarks.

SIGGRAPH Asia & TOG 2015

JumpCut: Non-Successive Mask Transfer and Interpolation for Video Cutout

Qingnan Fan, Fan Zhong, Dani Lischinski, Daniel Cohen-Or, Baoquan Chen.

codes / slides / video / supp file / dataset / bibtex

An interactive real-time video segmentation algorithm that significantly improves video cutout accuracy and efficiency.

SIGGRAPH & TOG 2014

Build-to-Last: Strength to Weight 3D Printed Objects

Lin Lu, Andrei Sharf, Haisen Zhao, Yuan Wei, Qingnan Fan, Xuelin Chen, Yann Savoye, Changhe Tu, Daniel Cohen-Or, Baoquan Chen.

video / bibtex

Reduce the material cost and weight of a given object while providing a durable printed model that is resistant to impact and external forces.

🎖 Honors and Awards

2022, Tencent Outstanding Contributor
2020, CCF Doctoral Dissertation Award Nominee (CCF 优博提名)
2018, Academic Star Nominee of Shandong University (10/20000)
2015, Presidential Scholarship of Shandong University (35/20000) (Highest honor for students at SDU; only 35 selected from around 20,000 candidates)

📖 Education

2019.09 - 2021.03, PostDoc, Stanford University
2014.09 - 2019.06, Ph.D., Shandong University
2010.09 - 2014.06, Undergraduate, Shandong University

💬 Invited Talks

2022.04, Active 3D scene understanding and its applications, Frontier Forum on "3D Vision and Intelligent Graphics" (“三维视觉与智能图形”前沿论坛), 图图名师讲堂
2021.10, Visual Localization, Embodied AI Workshop, VALSE
2019.01, Deep Learning in Computational Photography, USC ICT/UW Reality Lab/Berkeley/Stanford/Google/MSR
2018.12, Deep Learning for Single Image Artifact Removal, ACCV Tutorial
2018.12, Image Smoothing via Unsupervised Learning, GAMES Webinar
2018.08, Discovering Unsupervised Learning in Image Processing, CIA, Cambridge University

💻 Internships and Visiting Student Positions

2018.04 - 2019.08, Beijing Film Academy, China.
2018.08 - 2018.10, University of Cambridge, UK.
2016.09 - 2018.03, Microsoft Research Asia, China.
2015.04 - 2015.05, Tel Aviv University, Israel.
2014.10 - 2014.11, The Hebrew University of Jerusalem, Israel.