Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL

OPPO Research Institute
Date: August 7, 2025

🔥News!

  • [2025/08] We released the datasets, model checkpoints, and training and inference code for AFM. [New!]

Overview

We introduce Chain-of-Agents (CoA), a novel framework for training end-to-end agent foundation models (AFM) using multi-agent distillation and agentic reinforcement learning. Our approach addresses key challenges in developing versatile AI agents that can perform complex tasks across diverse domains.

The framework consists of two main components:

  1. Multi-Agent Distillation: Distills knowledge from multiple specialized agents into a single foundation model
  2. Agentic Reinforcement Learning: Fine-tunes the model using reinforcement learning with tool calling capabilities
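
As a rough illustration of the first component, the sketch below serializes the steps of several specialized agents into a single prompt/completion pair that one model could be fine-tuned on. It is a toy example under stated assumptions, not the project's actual data format: the `AgentStep` schema, the role names, and the `<role>...</role>` tag layout are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AgentStep:
    """One step produced by a specialized agent (hypothetical schema)."""
    role: str      # illustrative role name, e.g. "planner", "search", "answer"
    content: str   # text the agent produced at this step


def flatten_trajectory(question: str, steps: List[AgentStep]) -> dict:
    """Serialize a multi-agent trajectory into one prompt/completion pair.

    The <role>...</role> tag format is an assumption made for this sketch.
    """
    completion = "\n".join(f"<{s.role}>{s.content}</{s.role}>" for s in steps)
    return {"prompt": question, "completion": completion}


example = flatten_trajectory(
    "Who wrote the novel that inspired the film Blade Runner?",
    [
        AgentStep("planner", "Find the source novel, then its author."),
        AgentStep("search", "Blade Runner source novel"),
        AgentStep("answer", "Philip K. Dick"),
    ],
)
print(example["completion"])
```

Each line of the printed completion corresponds to one agent step, so a single model trained on such pairs would emit the whole chain itself rather than relying on separate agents at inference time.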

Key Features

  • CoA Distillation: Distills knowledge from multiple specialized agents into a unified foundation model
  • Tool Calling: Enhanced reinforcement learning with tool calling capabilities for complex tasks
  • End-to-End: Complete pipeline from data processing to model evaluation and deployment

Results

Our Chain-of-Agents framework demonstrates significant improvements over existing methods across multiple benchmarks. The results show that combining multi-agent distillation with agentic reinforcement learning produces high-performing foundation models.

Dataset & Model

We provide comprehensive resources for AFM development, including training datasets and pre-trained models. These resources support both web agent and code agent implementations, available in 7B and 32B parameter sizes, with RL and SFT variants (see Dataset Collections).

Datasets

Dataset Name                  Link
AFM-CodeAgent-SFT-Dataset     View Dataset
AFM-CodeAgent-RL-Dataset      View Dataset
AFM-WebAgent-RL-Dataset       View Dataset
AFM-MHQA-Agent-SFT-Dataset    View Dataset
AFM-MHQA-RL-Dataset           View Dataset

Models

Model Name                Link
AFM-WebAgent-32B-RL       View Model
AFM-WebAgent-7B-RL        View Model
AFM-MHQA-Agent-3B-RL      View Model
AFM-MHQA-Agent-7B-RL      View Model
AFM-CodeAgent-32B-RL      View Model
AFM-CodeAgent-7B-RL       View Model
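
The checkpoint names above appear to follow the pattern `AFM-<agent>-<size>-<stage>`. The sketch below is a small, hypothetical helper for picking a checkpoint by agent type or parameter size. The names are copied from the table; the helper functions `parse_model_name` and `find_models` are illustrative only, and no hosting organization or download URL is assumed because none is given here.

```python
from typing import List, Optional

# Checkpoint names copied from the table above. No hub organization or URL
# is given in the text, so none is assumed here.
AFM_MODELS = [
    "AFM-WebAgent-32B-RL",
    "AFM-WebAgent-7B-RL",
    "AFM-MHQA-Agent-3B-RL",
    "AFM-MHQA-Agent-7B-RL",
    "AFM-CodeAgent-32B-RL",
    "AFM-CodeAgent-7B-RL",
]


def parse_model_name(name: str) -> dict:
    """Split 'AFM-<agent>-<size>-<stage>' into its parts (assumed pattern)."""
    parts = name.split("-")
    # The agent part may itself contain a hyphen (e.g. 'MHQA-Agent'),
    # so size and stage are taken from the end of the name.
    return {"agent": "-".join(parts[1:-2]), "size": parts[-2], "stage": parts[-1]}


def find_models(agent: Optional[str] = None, size: Optional[str] = None) -> List[str]:
    """Return listed checkpoints matching the given agent and/or size."""
    matches = []
    for name in AFM_MODELS:
        info = parse_model_name(name)
        if agent is not None and info["agent"] != agent:
            continue
        if size is not None and info["size"] != size:
            continue
        matches.append(name)
    return matches


print(find_models(agent="CodeAgent", size="7B"))
```

Calling `find_models(size="32B")` would return the two 32B checkpoints in table order, which can be handy when scripting evaluations across model sizes.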

Technical Report

Citation

If you find our project helpful, please cite:

@article{chain-of-agents-2025,
  title={Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL},
  author={OPPO PersonalAI Lab},
  journal={arXiv preprint arXiv:xxxx.xxxxx},
  year={2025}
}