Publications

2024

  1. ReaLHF
    ReaLHF: Optimized RLHF Training for Large Language Models through Parameter Reallocation
    Zhiyu Mei*, Wei Fu*, Kaiwei Li, Guangju Wang, Huanchen Zhang, and Yi Wu
    Preprint (*: Equal Contribution), Jul 2024
  2. Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study
    Shusheng Xu, Wei Fu, Jiaxuan Gao, Wenjie Ye, Weilin Liu, Zhiyu Mei, Guangju Wang, Chao Yu, and Yi Wu
    ICML (Oral), Jul 2024
  3. SRL
    SRL: Scaling Distributed Reinforcement Learning to Over Ten Thousand Cores
    Zhiyu Mei*, Wei Fu*, Guangju Wang, Huanchen Zhang, and Yi Wu
    ICLR (*: Equal Contribution), May 2024
  4. Learning Agile Bipedal Motions on a Quadrupedal Robot
    Yunfei Li, Jinhan Li, Wei Fu, and Yi Wu
    ICRA, May 2024

2023

  1. SIPO
    Iteratively Learn Diverse Strategies with State Distance Information
    Wei Fu, Weihua Du*, Jingwei Li*, Sunli Chen, Jingzhao Zhang, and Yi Wu
    NeurIPS (*: Equal Contribution), Dec 2023

2022

  1. Revisiting Some Common Practices in Cooperative Multi-Agent Reinforcement Learning
    Wei Fu, Chao Yu, Zelai Xu, Jiaqi Yang, and Yi Wu
    ICML, Jul 2022
  2. RSPO
    Continuously Discovering Novel Strategies via Reward-Switching Policy Optimization
    Zihan Zhou*, Wei Fu*, Bingliang Zhang, and Yi Wu
    ICLR (*: Equal Contribution), Apr 2022

2021

  1. Unlocking the Potential of MAPPO with Asynchronous Optimization
    Wei Fu, Chao Yu, Yunfei Li, and Yi Wu
    CICAI (Oral), Jun 2021