Wei Fu

fuwth17 AT gmail DOT com

I’m currently a fifth-year PhD student in computer science at IIIS, Tsinghua University (expected graduation: June 2026). I am very fortunate to be advised by Professor Yi Wu. Prior to my PhD studies, I received my BEng degree from the Department of Electrical Engineering in 2021.

My research interests lie at the intersection of reinforcement learning (RL) and distributed systems. Currently, I focus on developing AReaL, the fastest and easiest way to scale up agentic RL training for LLMs and VLMs. The project has accumulated 2.5k GitHub stars as of September 2025.

I’d describe myself more as a programmer than a researcher.

I enjoy coding, cooking, listening to pop music, and working out at the gym.

News

Aug 09, 2025 Introducing ASearcher, a 32B search agent trained with AReaL that exhibits extreme long-horizon agentic execution capability. It achieves state-of-the-art performance among open-source models of the same size. Check out our open-sourced code, dataset, and checkpoints!
Aug 01, 2025 We refactored the legacy AReaL code into AReaL-lite, a lightweight, algorithm-first codebase that prioritizes a better development experience for AI researchers. Check out the quickstart guide to begin your journey with AReaL-lite!
Jun 23, 2024 Introducing ReaLHF, a highly efficient system for RLHF training of LLMs. It can achieve up to a 10x training speedup over existing open-source systems! Check out our open-sourced code and the documentation to get started with ReaL quickly! 🚀

Selected Publications

  1. ASearcher
    Beyond Ten Turns: Unlocking Long-Horizon Agentic Search with Large-Scale Asynchronous RL
    Jiaxuan Gao, Wei Fu, Minyang Xie, Shusheng Xu, Chuyi He, Zhiyu Mei, Banghua Zhu, and Yi Wu
    Aug 2025
  2. AReaL
    AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning
    Wei Fu, Jiaxuan Gao, Xujie Shen, Chen Zhu, Zhiyu Mei, Chuyi He, Shusheng Xu, Guo Wei, Jun Mei, Jiashu Wang, and 3 more authors
    May 2025
  3. ReaLHF
    ReaL: Efficient RLHF Training of Large Language Models with Parameter Reallocation
    Zhiyu Mei, Wei Fu, Kaiwei Li, Guangju Wang, Huanchen Zhang, and Yi Wu
    In MLSys 2025 (*: Equal Contribution), May 2025
  4. Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study
    Shusheng Xu, Wei Fu, Jiaxuan Gao, Wenjie Ye, Weilin Liu, Zhiyu Mei, Guangju Wang, Chao Yu, and Yi Wu
    ICML (Oral), Jul 2024
  5. SRL
    SRL: Scaling Distributed Reinforcement Learning to Over Ten Thousand Cores
    Zhiyu Mei*, Wei Fu*, Guangju Wang, Huanchen Zhang, and Yi Wu
    ICLR (*: Equal Contribution), May 2024
  6. Revisiting Some Common Practices in Cooperative Multi-Agent Reinforcement Learning
    Wei Fu, Chao Yu, Zelai Xu, Jiaqi Yang, and Yi Wu
    ICML, Jul 2022