suu990901

Zhenpeng Su suu990901

Researcher @ Kuaishou | M.E. @ UCAS | Focus on LLM Reasoning & Mixture-of-Experts

KlearReasoner (Public)

Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization

Python, 81 stars, 9 forks

LLaMA-MiLe-Loss (Public)

Code for a New Loss for Mitigating the Bias of Learning Difficulties in Generative Language Models

Python, 67 stars, 6 forks

chatgpt-comparison-detection-HC3-Plus (Public)

Code for HC3 Plus: A Semantic-Invariant Human ChatGPT Comparison Corpus

Python, 6 stars, 2 forks

Dial-MAE (Public)

Code for Dial-MAE: ConTextual Masked Auto-Encoder for Retrieval-based Dialogue Systems

Python, 6 stars

gem (Public)

Forked from axon-rl/gem

A Gym for Agentic LLMs

Python, 2 stars

ARPO (Public)

Forked from RUC-NLPIR/ARPO

Official code for “Agentic Reinforced Policy Optimization”, an agentic RL algorithm.

Python, 1 star