🚀 [2025/10/31] Recent Updates Summary for the ROLL Project: ROLL Flash #207

@PanAndy

Hello everyone! Thank you for your interest in ROLL.
ROLL has recently gained many new features, with ROLL Flash at the core. ROLL Flash significantly improves training efficiency through its asynchronous training architecture.
Below is a summary of the code updates. We will continue to iterate on and improve ROLL, and you are welcome to join the ROLL community.

🚀 Core Highlights

  • Asynchronous Training Architecture: A new asynchronous generation scheduler that efficiently overlaps generation, reward calculation, and model training in a pipeline
  • Significant Performance Improvements: Up to 2.24× speedup in RLVR tasks and up to 2.72× speedup in Agentic tasks
  • Near-linear Scaling: Maintains near-linear throughput scaling at the hundred-GPU scale, with 8× the GPU resources delivering a 7.6× efficiency improvement (about 95% of ideal linear scaling)
  • Support for Multiple Off-policy Algorithms: Integrates several off-policy algorithms (Decoupled PPO, TOPR, CISPO, etc.) with performance comparable to synchronous training
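
To make the idea of overlapping generation and training more concrete, here is a minimal sketch using Python's `asyncio`. It is not ROLL's scheduler: the function names, the `max_pending` parameter, and the sleep-based stand-ins for generation and training are all assumptions made for illustration. A bounded queue lets the generator run ahead of the trainer by a limited number of batches, and each batch records the policy version that produced it so the trainer can measure how stale (off-policy) the data is.

```python
import asyncio

async def generate(queue, num_batches, policy_version):
    """Producer: emits rollout batches tagged with the policy version that made them."""
    for step in range(num_batches):
        await asyncio.sleep(0.01)  # stand-in for LLM generation + reward calculation
        await queue.put({"batch_id": step, "gen_version": policy_version["v"]})
    await queue.put(None)  # sentinel: no more batches

async def train(queue, policy_version, log):
    """Consumer: trains on whichever batch is ready while generation keeps running."""
    while True:
        batch = await queue.get()
        if batch is None:
            break
        await asyncio.sleep(0.01)  # stand-in for a gradient update
        staleness = policy_version["v"] - batch["gen_version"]
        log.append((batch["batch_id"], staleness))
        policy_version["v"] += 1  # one policy update per consumed batch

async def run_async_pipeline(num_batches=4, max_pending=2):
    # A bounded queue caps how far generation can run ahead of training
    # (hypothetical knob, not a ROLL config option).
    queue = asyncio.Queue(maxsize=max_pending)
    policy_version = {"v": 0}
    log = []
    await asyncio.gather(
        generate(queue, num_batches, policy_version),
        train(queue, policy_version, log),
    )
    return log

if __name__ == "__main__":
    for batch_id, staleness in asyncio.run(run_async_pipeline()):
        print(f"batch {batch_id}: trained with staleness {staleness}")
```

In this toy setup the staleness of a batch can never exceed `max_pending + 1`, which is the kind of bounded off-policy gap that off-policy corrections are meant to absorb.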

🔧 Major New Features

  • ROLL Flash

    • Asynchronous training: docs_roll/docs/English/UserGuide/async_training.md
    • Queue Scheduling mechanism for independent task scheduling to maximize GPU utilization: roll/distributed/scheduler/async_generate_scheduler.py
    • Environment-Level Async Rollout to avoid GPU waiting for environment interactions: docs_roll/docs/English/UserGuide/async_parallel_rollout.md
    • Redundant Environment Rollout capability to improve training robustness: roll/pipeline/agentic/agentic_config.py:37
    • Off-policy algorithms: docs_roll/docs/English/UserGuide/algorithms/offpolicy_setting.md
  • Agentic

    • Adjusted RolloutScheduler implementation for better control over EnvManager interactions: docs_roll/docs/English/UserGuide/agentic/agentic_engineer_practice.md
    • GlobalDataset component for custom environments, avoiding the network and memory bottlenecks caused by each env reading data individually
      • Code: roll/datasets/global_dataset.py
      • Documentation: docs_roll/docs/English/UserGuide/agentic/agentic_engineer_practice.md
    • Support for configuring validation (val) dataset traversal: docs_roll/docs/English/UserGuide/agentic/agentic_engineer_practice.md
    • Support for dumping trajectories for trajectory synthesis: docs_roll/docs/English/UserGuide/agentic/agentic_engineer_practice.md
    • Support for stateful trajectory filtering: docs_roll/docs/English/UserGuide/agentic/agentic_engineer_practice.md
  • Performance Optimization & Backend

    • Dynamic batching optimization: roll/utils/dynamic_batching.py
    • Optimized DistillPipeline to transfer logits from teacher to student more efficiently
    • Added complete support for vLLM 0.11.0
  • Documentation

    • FP8 rollout configuration documentation: docs_roll/docs/English/UserGuide/backend/fp8_rollout.md
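
As a rough illustration of the Redundant Environment Rollout item listed under ROLL Flash above, the sketch below shows one common way such redundancy can work: launch more environment rollouts than a training step needs, keep the first ones to finish, and cancel the stragglers. This is an assumed interpretation, not ROLL's implementation; `run_env`, `collect_with_redundancy`, and the `delays` argument are hypothetical names used only for this example.

```python
import asyncio

async def run_env(env_id, delay):
    """Hypothetical environment rollout; `delay` stands in for slow env interaction."""
    await asyncio.sleep(delay)
    return {"env_id": env_id, "trajectory": f"traj-{env_id}"}

async def collect_with_redundancy(delays, num_needed):
    """Launch len(delays) rollouts, keep the first `num_needed` to finish, cancel the rest."""
    tasks = [asyncio.create_task(run_env(i, d)) for i, d in enumerate(delays)]
    finished = []
    for next_done in asyncio.as_completed(tasks):
        finished.append(await next_done)
        if len(finished) == num_needed:
            break
    for task in tasks:
        task.cancel()  # no effect on tasks that already finished
    # Wait for cancellations to settle so no task is left pending.
    await asyncio.gather(*tasks, return_exceptions=True)
    return finished

if __name__ == "__main__":
    # Four environments are launched but only two trajectories are needed,
    # so the two slowest environments are cancelled.
    results = asyncio.run(collect_with_redundancy([0.2, 0.01, 1.0, 0.05], num_needed=2))
    print(sorted(r["env_id"] for r in results))
```

The trade-off in a scheme like this is extra environment compute in exchange for a training step that is not held up by a single slow or failed environment.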

Whether you are working on mathematical reasoning, code generation, or building real-world interactive LLM agents, ROLL Flash can help you train stronger models faster, more stably, and more cost-effectively.

The ROLL team will continue to invest in system and algorithm co-innovation for RL with LLMs, and is dedicated to building an easy-to-use, efficient, and scalable open-source ecosystem.
You are welcome to star the repository, try it out, and contribute code to help move LLM reinforcement learning toward practical, large-scale deployment! 🌟
