Half day, June 3rd/4th, 2026
Call for Papers
Test-time scaling, which has shown remarkable success in improving reasoning for large language models, holds significant promise for computer vision and multimodal systems. By allocating additional computation during inference, vision models can enhance accuracy, robustness, and interpretability in complex reasoning tasks. Recent advances in the "thinking with images" paradigm, where models perform visual chain-of-thought reasoning through iterative perception and synthesis, suggest a shift toward visually grounded cognition rather than purely symbolic inference. Extending test-time scaling to this setting could enable adaptive visual reasoning, where models selectively focus computation on ambiguous or conceptually rich regions. Coupled with emerging trends such as multimodal reflection, self-evaluation, and scalable visual generation, this approach paves the way for more general, controllable, and interpretable vision reasoning systems. However, scaling inference on high-dimensional visual inputs remains computationally expensive, efficient allocation of resources is still an open problem, and ensuring robustness, safety, and energy efficiency under expanded test-time computation poses significant challenges.
The 2nd Workshop on Test-time Scaling for Computer Vision (ViSCALE) aims to explore the frontiers of scaling test-time computation in vision models, addressing both theoretical advancements and practical implementations. The workshop will discuss the suitability of test-time scaling for traditional vision tasks such as perception, as well as its extension to multimodal and generative models, toward enhancing performance in critical domains. It will also cover solutions for efficient algorithms, considerations of robustness and safety, and novel problems in computer vision posed by test-time scaling. By bringing together experts, the workshop seeks to foster collaboration and innovation in applying this paradigm to push the limits of computer vision.
We invite submissions of original research papers, work-in-progress papers, and extended abstracts. Topics of interest include but are not limited to:
Submission Guidelines:
All submissions will be handled via OpenReview. The review process is double-blind. All papers must be formatted using the CVPR 2026 Author Kit. We welcome different types of submissions to the workshop, including:
We strongly encourage authors to carefully follow the CVPR Author Guidelines, since our workshop will adhere to the same formatting and submission policies as the main conference.
Submission Begins
Submission Deadline
Author Notification and Meta-Data Collection
Camera-Ready Submission
Workshop Day
TBD
Explore previous workshops and their contributions to the field
Test-time Scaling for Computer Vision
For any inquiries, feel free to reach out to us via email at viscalecvpr@gmail.com. You may also contact the organizers directly: Yinpeng Dong, Yichi Zhang.