Chapter 7 AutoGen: Conversational Multi-Agent Systems - AI Agent Framework Tutorial

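AutoGen's Design Philosophy

AutoGen (open-sourced by Microsoft Research) is built on one core idea: multi-agent collaboration is modeled as a conversation. Every agent is a participant in that conversation, and agents cooperate on a task by passing messages to one another.

AutoGen 0.4 is a complete rewrite of 0.2. It introduces a new architecture based on asynchronous message passing, which supports more flexible agent topologies and better concurrency.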
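To make the asynchronous message-passing idea concrete, here is a minimal sketch that uses only the Python standard library. It is not AutoGen's runtime: `Message`, `MessageBus`, `echo_agent`, and `demo` are hypothetical names invented for this illustration, and a real runtime would add routing, serialization, and lifecycle management on top.

```python
import asyncio
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Message:
    source: str   # name of the sending agent
    target: str   # name of the receiving agent
    content: str


class MessageBus:
    """Toy in-process bus: one asyncio.Queue inbox per registered agent."""

    def __init__(self) -> None:
        self._inboxes: Dict[str, asyncio.Queue] = {}

    def register(self, name: str) -> asyncio.Queue:
        self._inboxes[name] = asyncio.Queue()
        return self._inboxes[name]

    async def send(self, msg: Message) -> None:
        # Delivery is just a put on the target's inbox.
        await self._inboxes[msg.target].put(msg)


async def echo_agent(name: str, bus: MessageBus, inbox: asyncio.Queue, log: List[str]) -> None:
    """Hypothetical agent: waits for one message, acknowledges it, then stops."""
    msg = await inbox.get()
    log.append(f"{name} received: {msg.content}")
    await bus.send(Message(name, msg.source, f"ack: {msg.content}"))


async def demo() -> List[str]:
    bus = MessageBus()
    log: List[str] = []
    user_inbox = bus.register("user")
    assistant_inbox = bus.register("assistant")
    # The assistant runs concurrently and reacts when a message arrives.
    worker = asyncio.create_task(echo_agent("assistant", bus, assistant_inbox, log))
    await bus.send(Message("user", "assistant", "write fib(n)"))
    reply = await user_inbox.get()
    await worker
    log.append(f"user received: {reply.content}")
    return log
```

Running `asyncio.run(demo())` returns the two log entries in delivery order. Neither side calls the other directly; each only reads its own inbox, which is what makes other topologies (broadcast, group chat) possible without changing the agents.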

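AssistantAgent

An LLM-driven agent that generates replies, code, and plans. It acts as the "brain": it receives the task description, proposes a solution, and revises it based on feedback. Its role and behavioral rules are set through system_message.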

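UserProxyAgent

An agent that stands in for the user. In the AutoGen 0.2 API, its main job is to execute the code generated by AssistantAgent and feed the results back to it; it can run fully automatically (human_input_mode="NEVER") or semi-automatically (with human confirmation). In the 0.4 AgentChat API, UserProxyAgent only relays human input, and code execution is handled by CodeExecutorAgent.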

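GroupChat

A conversation with multiple participating agents, in which GroupChatManager controls the speaking order. In the 0.2 API, the selection modes include round_robin (agents take turns), auto (the LLM picks the next speaker), and manual (a human picks). In 0.4, the corresponding team classes are RoundRobinGroupChat and SelectorGroupChat.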
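The difference between fixed-order and rule-based speaker selection can be sketched in a few lines of plain Python. This is an illustration of the idea only, not AutoGen's implementation; `Turn`, `round_robin`, `rule_based_selector`, and the agent names are assumptions made for the example.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Iterator, List


@dataclass
class Turn:
    source: str   # who spoke
    content: str  # what they said


def round_robin(participants: List[str]) -> Iterator[str]:
    """Fixed order: every participant speaks in turn, forever."""
    return cycle(participants)


def rule_based_selector(history: List[Turn]) -> str:
    """Hypothetical custom rule: pick the next speaker from the last turn."""
    if not history:
        return "architect"                 # start with the design
    last = history[-1]
    if "def " in last.content:
        return "code_executor"             # code was just written: run it
    if last.source == "code_executor":
        # Failures go back to the developer, successes to the tester.
        return "developer" if "error" in last.content.lower() else "tester"
    return "developer"
```

A deterministic rule like this is cheaper and more predictable than asking an LLM to choose, at the cost of handling only the situations the rule anticipates.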

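ConversableAgent

The base class of all agent types in the AutoGen 0.2 API (in the 0.4 AgentChat API, this role is played by BaseChatAgent). It supports function registration, nested chats, message filtering, and other advanced features.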

AutoGen 0.4 Architecture Diagram

AutoGen 0.4 asynchronous messaging architecture:

┌─────────────────────────────────────────────────────┐
│                       Runtime                       │
│ ┌─────────────────────────────────────────────────┐ │
│ │                   Message Bus                   │ │
│ └──────┬──────────────────────────┬───────────────┘ │
│        │                          │                 │
│ ┌──────▼──────┐          ┌────────▼──────┐          │
│ │  Assistant  │ messages │   UserProxy   │          │
│ │    Agent    │◄────────►│     Agent     │          │
│ │ (LLM-driven)│          │  (runs code)  │          │
│ └─────────────┘          └───────────────┘          │
│                                  │                  │
│                          ┌───────▼──────┐           │
│                          │ CodeExecutor │           │
│                          │  (sandbox)   │           │
│                          └──────────────┘           │
└─────────────────────────────────────────────────────┘

Conversation flow (code-generation task):

User: "Write a Python function that computes the Fibonacci sequence"
  ↓
Assistant: "Sure, here is the code: def fib(n):..."
  ↓
UserProxy: [executes the code] → "The code ran successfully, output: [1,1,2,3,5...]"
  ↓
Assistant: "The code works. Would you like me to add caching?"
  ↓
UserProxy: "Yes" (or continues automatically)
  ↓
Repeat until the task is complete or the turn limit is reached
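The conversation flow above is, at its core, a bounded generate-execute-feedback loop. The sketch below shows that control flow with a scripted stand-in for the LLM. All names (`run_loop`, `scripted_assistant`, `unsafe_local_execute`) are hypothetical, and the executor uses `exec` on trusted demo strings only, which is exactly what a real deployment must not do with LLM output.

```python
import contextlib
import io
from typing import Callable, List, Tuple


def run_loop(
    generate: Callable[[List[str]], str],
    execute: Callable[[str], Tuple[bool, str]],
    max_turns: int = 10,
) -> List[str]:
    """Alternate generate -> execute until a run succeeds or turns run out."""
    transcript: List[str] = []
    for _ in range(max_turns):
        code = generate(transcript)
        transcript.append(f"assistant: {code}")
        ok, output = execute(code)
        transcript.append(f"executor: {output}")
        if ok:
            break
    return transcript


FIXED = "def fib(n):\n    return n if n < 2 else fib(n - 1) + fib(n - 2)\nprint(fib(5))"


def scripted_assistant(transcript: List[str]) -> str:
    """Stand-in for an LLM: the first attempt forgets to define fib."""
    attempts = ["print(fib(5))", FIXED]
    return attempts[min(len(transcript) // 2, len(attempts) - 1)]


def unsafe_local_execute(code: str) -> Tuple[bool, str]:
    """Demo only: exec() offers no isolation. Never run untrusted code this way."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, {})
    except Exception as exc:
        return False, f"{type(exc).__name__}: {exc}"
    return True, buffer.getvalue().strip()
```

Calling `run_loop(scripted_assistant, unsafe_local_execute)` produces four transcript entries: a failed first attempt with a NameError, then the corrected code and its output. The `max_turns` argument is what prevents the loop from running forever when no attempt ever succeeds.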

Basic Usage of AutoGen 0.4

import asyncio
from autogen_agentchat.agents import AssistantAgent, UserProxyAgent
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.conditions import TextMentionTermination
# In AutoGen 0.4 the OpenAI client lives in the autogen_ext.models.openai submodule
from autogen_ext.models.openai import OpenAIChatCompletionClient

# ── Configure the model client ────────────────────────────
model_client = OpenAIChatCompletionClient(
    model="gpt-4o-mini",
    api_key="your-api-key",
    temperature=0
)

# ── Create the AssistantAgent ─────────────────────────────
assistant = AssistantAgent(
    name="python_expert",
    model_client=model_client,
    system_message="""You are a senior Python engineer.
    Responsibilities: write high-quality, commented Python code and explain design decisions.
    When the code has been tested and runs correctly, write 'TASK_COMPLETE' at the end of the message.
    Code must include: type annotations, error handling, unit tests."""
)

# ── Create the UserProxyAgent ─────────────────────────────
user_proxy = UserProxyAgent(
    name="code_tester",
    # In 0.4, UserProxyAgent relays human input (input_func defaults to input()).
    # The 0.2 human_input_mode values (NEVER / TERMINATE / ALWAYS) do not apply here;
    # for automatic code execution, use CodeExecutorAgent instead.
)

# ── Set the termination condition ─────────────────────────
termination = TextMentionTermination("TASK_COMPLETE")

# ── Two-agent conversation ────────────────────────────────
async def run_coding_task():
    team = RoundRobinGroupChat(
        participants=[assistant, user_proxy],
        termination_condition=termination,
        max_turns=10
    )

    result = await team.run(
        task="""Implement a thread-safe cache decorator @lru_cache_threadsafe
        that supports a maxsize parameter, and write complete test cases."""
    )

    # result.messages is a plain list, so iterate with a regular for loop
    for msg in result.messages:
        print(f"[{msg.source}] {str(msg.content)[:200]}\n")

asyncio.run(run_coding_task())

Code Execution Sandbox

AutoGen supports several code execution environments. Production deployments must use an isolated sandbox:

from contextlib import asynccontextmanager
from autogen_ext.code_executors.docker import DockerCommandLineCodeExecutor
from autogen_ext.code_executors.local import LocalCommandLineCodeExecutor
from autogen_agentchat.agents import CodeExecutorAgent
import tempfile
from pathlib import Path

# ── Option 1: local execution (development/testing only) ──
work_dir = Path(tempfile.mkdtemp())
local_executor = LocalCommandLineCodeExecutor(
    work_dir=work_dir,
    timeout=30,  # execution timeout (seconds)
    virtual_env_context=None
)

# ── Option 2: isolated Docker sandbox (recommended for production) ──
docker_executor = DockerCommandLineCodeExecutor(
    image="python:3.11-slim",      # minimal image
    timeout=60,
    work_dir=work_dir,
    bind_dir=work_dir,             # mount the working directory
    auto_remove=True,              # remove the container after execution
)

# ── Create the code-execution agent ───────────────────────
# asynccontextmanager lets callers write:
#     async with create_executor_agent() as code_agent: ...
@asynccontextmanager
async def create_executor_agent():
    async with DockerCommandLineCodeExecutor(
        image="python:3.11-slim",
        timeout=60
    ) as executor:
        code_agent = CodeExecutorAgent(
            name="safe_code_executor",
            code_executor=executor
        )
        yield code_agent

# ── Restrict the available Python packages ────────────────
# Example Dockerfile (install only what is needed to reduce the attack surface):
# FROM python:3.11-slim
# RUN pip install numpy pandas matplotlib --no-cache-dir
# RUN useradd -m -u 1000 sandbox
# # Run as a non-root user (Dockerfile comments must be on their own line)
# USER sandbox
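To show what the timeout and work_dir settings above protect against, here is a standard-library sketch of running a snippet in a separate process with a hard time limit and a throwaway working directory. It is an assumption-laden illustration, not how AutoGen's executors are implemented, and a subprocess is not a sandbox: the child still has the parent's file and network access. `run_with_timeout` is a hypothetical helper name.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_with_timeout(code: str, timeout: float = 30.0) -> dict:
    """Run `code` in a child Python process inside a temporary directory."""
    with tempfile.TemporaryDirectory() as work_dir:
        script = Path(work_dir) / "snippet.py"
        script.write_text(code, encoding="utf-8")
        try:
            proc = subprocess.run(
                # -I: isolated mode, ignores PYTHON* env vars and user site-packages
                [sys.executable, "-I", str(script)],
                cwd=work_dir,
                capture_output=True,
                text=True,
                timeout=timeout,  # the child is killed when this expires
            )
        except subprocess.TimeoutExpired:
            return {"exit_code": None, "output": "", "timed_out": True}
        return {
            "exit_code": proc.returncode,
            "output": proc.stdout + proc.stderr,
            "timed_out": False,
        }
```

For example, `run_with_timeout("import time; time.sleep(5)", timeout=0.5)` reports `timed_out=True` instead of blocking the agent loop. The temporary directory keeps generated files out of the project tree, but real isolation still requires a container or similar boundary.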

GroupChat: Multi-Agent Group Chat

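from autogen_agentchat.teams import SelectorGroupChat
from autogen_agentchat.conditions import MaxMessageTermination

# Build a team of specialists
architect = AssistantAgent(
    name="architect",
    model_client=model_client,
    system_message="""System architecture expert. Responsible for the overall design,
    technology choices, and reviewing whether code follows the architecture principles."""
)

developer = AssistantAgent(
    name="developer",
    model_client=model_client,
    system_message="""Full-stack developer. Writes code based on the architect's design,
    handles implementation details, and fixes bugs."""
)

tester = AssistantAgent(
    name="tester",
    model_client=model_client,
    system_message="""QA engineer. Writes test cases, finds edge cases and potential bugs,
    verifies code quality, and says 'ALL_TESTS_PASSED' once every test passes."""
)

code_executor = CodeExecutorAgent(
    name="code_executor",
    code_executor=local_executor  # local executor: development only
)

# SelectorGroupChat: an LLM decides who speaks next
termination = (
    TextMentionTermination("ALL_TESTS_PASSED") |
    MaxMessageTermination(30)
)

team = SelectorGroupChat(
    participants=[architect, developer, tester, code_executor],
    model_client=model_client,
    termination_condition=termination,
    selector_prompt="""Based on the conversation history, select the most suitable next speaker.
    Current participants: {participants}
    Conversation history: {history}
    Selection rules: design phase → architect; implementation phase → developer;
                     after code is generated → code_executor; on errors → developer; verification phase → tester.
    Next speaker:"""
)

async def run_team():
    result = await team.run(
        task="Design and implement a Redis-based distributed lock that supports timeouts and reentrancy."
    )
    return result

asyncio.run(run_team())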
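The `|` in the termination condition above combines two stop rules so that either one ends the chat. A rough sketch of how such composition can work through Python's `__or__` hook follows; the classes `Condition`, `TextMention`, `MaxMessages`, and `AnyOf` are toy stand-ins written for this example and do not mirror AutoGen's internal classes.

```python
from typing import List


class Condition:
    """Base class: subclasses decide whether the conversation should stop."""

    def check(self, messages: List[str]) -> bool:
        raise NotImplementedError

    def __or__(self, other: "Condition") -> "Condition":
        # a | b builds a new condition that fires when either one fires
        return AnyOf(self, other)


class TextMention(Condition):
    def __init__(self, text: str) -> None:
        self.text = text

    def check(self, messages: List[str]) -> bool:
        return any(self.text in m for m in messages)


class MaxMessages(Condition):
    def __init__(self, limit: int) -> None:
        self.limit = limit

    def check(self, messages: List[str]) -> bool:
        return len(messages) >= self.limit


class AnyOf(Condition):
    def __init__(self, *conditions: Condition) -> None:
        self.conditions = conditions

    def check(self, messages: List[str]) -> bool:
        return any(c.check(messages) for c in self.conditions)
```

With `stop = TextMention("ALL_TESTS_PASSED") | MaxMessages(30)`, the keyword ends the chat on success while the message cap acts as a safety net when the keyword never appears.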

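AutoGen vs CrewAI vs LangGraph: A Three-Framework Comparison

Feature	AutoGen 0.4	CrewAI 0.7	LangGraph 0.2
Core paradigm	Conversational message passing	Role-playing task delegation	Directed-graph state machine
Code execution	Built in, with Docker sandbox	Requires a custom tool	Requires a custom node
Async support	Native async/await	Partial	Supported via ainvoke
Learning curve	Medium (requires understanding the messaging architecture)	Low (high-level abstractions)	Medium-high (graph concepts)
Best suited for	Code generation / debugging / data analysis	Content creation / research reports	Complex business workflows / production
Controllability	Medium (the conversation is hard to control precisely)	Medium (configurable flow)	High (fully explicit control)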

Security Risks of Code Execution in AutoGen

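Missing code-execution sandbox: In the 0.2 API, AutoGen's UserProxyAgent can by default execute arbitrary LLM-generated Python code on the local machine (in 0.4, the same risk applies to a CodeExecutorAgent backed by LocalCommandLineCodeExecutor). This is convenient in development but extremely dangerous in production: a malicious prompt or an LLM hallucination can produce code that deletes files, leaks environment variables, or makes network requests. Production environments must use DockerCommandLineCodeExecutor to isolate execution inside a container.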

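Infinite conversation loops: An AutoGen group chat with no turn limit can keep agents talking to each other indefinitely. Always set one (max_consecutive_auto_reply in the 0.2 API; max_turns or MaxMessageTermination in 0.4). 10-15 turns are usually enough for a complex task; more than 20 almost certainly means the agents are stuck in an unproductive loop.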

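GroupChat speaker contention: With GroupChatManager's default speaker_selection_method="auto", the LLM chooses the next speaker. Once there are more than 5 agents, the LLM's selection accuracy drops noticeably. Prefer round_robin or a custom selection function (speaker_selection_method=custom_fn in the 0.2 API; in 0.4, SelectorGroupChat accepts a selector_func).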

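Where AutoGen fits best: AutoGen is particularly well suited to loops of "generate code + execute automatically + fix errors": generating data-analysis scripts, implementing and verifying algorithms, writing automated tests, and any other task that follows a "write code → run → inspect results → revise" cycle. In these scenarios it feels more natural than either CrewAI or LangGraph.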
