Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL

OPPO Personal AI Lab

OPPO Research Institute
Date: August 7, 2025

Overview

We introduce Chain-of-Agents (CoA), a novel framework for training end-to-end agent foundation models (AFM) using multi-agent distillation and agentic reinforcement learning. Our approach addresses key challenges in developing versatile AI agents that can perform complex tasks across diverse domains.

The framework consists of two main components:

Multi-Agent Distillation: Distills knowledge from multiple specialized agents into a single foundation model
Agentic Reinforcement Learning: Fine-tunes the model using reinforcement learning with tool calling capabilities
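As a rough illustration of the first component, the sketch below flattens a recorded multi-agent run into one text sequence that a single model could be fine-tuned on. It is a toy under stated assumptions, not the project's actual data format: the `AgentStep` schema, the role names, the tag layout, and the `to_training_text` helper are all hypothetical and do not come from this page.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AgentStep:
    """One step taken by a specialized agent in a multi-agent run (hypothetical schema)."""
    role: str      # illustrative role name, e.g. "planner", "web_search", "coder"
    content: str   # text produced by that agent at this step


def to_training_text(task: str, steps: List[AgentStep], answer: str) -> str:
    """Flatten a multi-agent trajectory into a single sequence for supervised fine-tuning.

    Each step is wrapped in a tag named after its role, so that one model can
    learn to emit the whole chain itself. The tag format is an assumption.
    """
    parts = [f"<task>{task}</task>"]
    for step in steps:
        parts.append(f"<{step.role}>{step.content}</{step.role}>")
    parts.append(f"<answer>{answer}</answer>")
    return "\n".join(parts)


# A made-up trajectory from three specialized agents.
trajectory = [
    AgentStep("planner", "Search for the release year, then compute the difference."),
    AgentStep("web_search", "query: release year of the first film"),
    AgentStep("coder", "print(2025 - 1999)"),
]
example = to_training_text("How many years ago was the film released?", trajectory, "26")
print(example)
```

The point of the sketch is only that the output of several agents ends up in one sequence, so a single model is trained to reproduce the full chain; the real pipeline may differ in every detail.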

Key Features

CoA Distillation

Distills knowledge from multiple specialized agents into a unified foundation model

Tool Calling

Enhanced reinforcement learning with tool calling capabilities for complex tasks

End-to-End

Complete pipeline from data processing to model evaluation and deployment

Results

Across multiple benchmarks, models trained with the Chain-of-Agents framework improve on existing methods; the detailed numbers are given in the technical report (arXiv:2508.13167). The results indicate that combining multi-agent distillation with agentic reinforcement learning produces high-performing foundation models.

Dataset & Model

We provide resources for AFM development, including training datasets and pre-trained models. These resources cover web agent, code agent, and MHQA agent implementations. The models listed below come in 3B, 7B, and 32B parameter sizes, in RL and SFT variants (see Dataset Collections).

Datasets

Dataset Name	Link
AFM-CodeAgent-SFT-Dataset	View Dataset
AFM-CodeAgent-RL-Dataset	View Dataset
AFM-WebAgent-RL-Dataset	View Dataset
AFM-MHQA-Agent-SFT-Dataset	View Dataset
AFM-MHQA-RL-Dataset	View Dataset

Models

Model Name	Link
AFM-WebAgent-32B-RL	View Model
AFM-WebAgent-7B-RL	View Model
AFM-MHQA-Agent-3B-RL	View Model
AFM-MHQA-Agent-7B-RL	View Model
AFM-CodeAgent-32B-RL	View Model
AFM-CodeAgent-7B-RL	View Model
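
The model names above appear to follow an `AFM-<agent>-<size>-<stage>` pattern. The sketch below splits a name into those parts, which can help when selecting a checkpoint by size or agent type. The pattern is inferred from the table rather than documented, and `ModelName` and `parse_model_name` are hypothetical helpers, not part of any released tooling.

```python
import re
from typing import NamedTuple, Optional


class ModelName(NamedTuple):
    agent: str   # e.g. "WebAgent", "CodeAgent", "MHQA-Agent"
    size: str    # e.g. "3B", "7B", "32B"
    stage: str   # "RL" or "SFT"


# Naming pattern inferred from the table above (an assumption, not a documented rule).
_PATTERN = re.compile(r"^AFM-(?P<agent>.+)-(?P<size>\d+B)-(?P<stage>RL|SFT)$")


def parse_model_name(name: str) -> Optional[ModelName]:
    """Return the parsed parts of a model name, or None if it does not match."""
    match = _PATTERN.match(name)
    if match is None:
        return None
    return ModelName(match["agent"], match["size"], match["stage"])


names = [
    "AFM-WebAgent-32B-RL",
    "AFM-WebAgent-7B-RL",
    "AFM-MHQA-Agent-3B-RL",
    "AFM-MHQA-Agent-7B-RL",
    "AFM-CodeAgent-32B-RL",
    "AFM-CodeAgent-7B-RL",
]

# Select every 7B model from the list.
seven_b = [n for n in names if (p := parse_model_name(n)) and p.size == "7B"]
print(seven_b)
```

Dataset names such as `AFM-CodeAgent-SFT-Dataset` do not match this pattern, so `parse_model_name` returns `None` for them.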

Technical Report

Citation

If you find our project helpful, please cite:

@misc{li2025chainofagentsendtoendagentfoundation,
      title={Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL}, 
      author={Weizhen Li and Jianbo Lin and Zhuosong Jiang and Jingyi Cao and Xinpeng Liu and Jiayu Zhang and Zhenqiang Huang and Qianben Chen and Weichen Sun and Qiexiang Wang and Hongxuan Lu and Tianrui Qin and Chenghao Zhu and Yi Yao and Shuying Fan and Xiaowan Li and Tiannan Wang and Pai Liu and King Zhu and He Zhu and Dingfeng Shi and Piaohong Wang and Yeyi Guan and Xiangru Tang and Minghao Liu and Yuchen Eleanor Jiang and Jian Yang and Jiaheng Liu and Ge Zhang and Wangchunshu Zhou},
      year={2025},
      eprint={2508.13167},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2508.13167}, 
}