<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Reflexion on Zata-砸它</title><link>https://www.zata.cc/tags/reflexion/</link><description>Recent content in Reflexion on Zata-砸它</description><generator>Hugo -- gohugo.io</generator><language>zh-cn</language><copyright>Example Person</copyright><lastBuildDate>Mon, 29 Jun 2026 09:57:58 +0800</lastBuildDate><atom:link href="https://www.zata.cc/tags/reflexion/index.xml" rel="self" type="application/rss+xml"/><item><title>AI Agent Loop 工程：原理、模式与实现</title><link>https://www.zata.cc/p/ai-agent-loop-%E5%B7%A5%E7%A8%8B%E5%8E%9F%E7%90%86%E6%A8%A1%E5%BC%8F%E4%B8%8E%E5%AE%9E%E7%8E%B0/</link><pubDate>Fri, 26 Jun 2026 15:40:07 +0800</pubDate><guid>https://www.zata.cc/p/ai-agent-loop-%E5%B7%A5%E7%A8%8B%E5%8E%9F%E7%90%86%E6%A8%A1%E5%BC%8F%E4%B8%8E%E5%AE%9E%E7%8E%B0/</guid><description>&lt;img src="https://www.zata.cc/images/index/index.png" alt="Featured image of post AI Agent Loop 工程：原理、模式与实现" />&lt;blockquote>
&lt;p>一句话概括 Agent 工程的核心:&lt;strong>把 LLM 放进一个可控的循环里,让它在&amp;quot;思考—行动—观察—反思&amp;quot;之间反复迭代,直到任务收敛。&lt;/strong> 这个&amp;quot;循环&amp;quot;——也就是 Agent Loop——就是本文的主角。&lt;/p>
&lt;/blockquote>
&lt;hr>
&lt;h2 id="0-为什么是-loop">0. 为什么是 &amp;ldquo;Loop&amp;rdquo;?
&lt;/h2>&lt;p>如果你把 LLM 当成&amp;quot;一个函数 &lt;code>f(prompt) -&amp;gt; response&lt;/code>&amp;quot;,那你写出的就是 prompt 工程;但如果你把 LLM 当成&amp;quot;一个可以被调用的决策者&amp;quot;,你就会发现:&lt;strong>几乎所有复杂的 Agent 行为,本质上都是一个循环&lt;/strong>。&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>场景&lt;/th>
&lt;th>单次调用能做到吗?&lt;/th>
&lt;th>为什么需要 Loop?&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>让模型查今天北京天气&lt;/td>
&lt;td>✅&lt;/td>
&lt;td>一次生成即可&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>让模型读完 50 页 PDF 后回答&lt;/td>
&lt;td>❌&lt;/td>
&lt;td>需要循环:分块 → 读取 → 累计 → 汇总&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>让模型调用 5 个 API 完成订单&lt;/td>
&lt;td>❌&lt;/td>
&lt;td>需要循环:规划 → 调 API → 处理异常 → 重试&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>让模型写出能跑通的代码&lt;/td>
&lt;td>❌&lt;/td>
&lt;td>需要循环:写代码 → 执行 → 报错 → 修正&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>让模型通过多轮对话解决开放问题&lt;/td>
&lt;td>❌&lt;/td>
&lt;td>需要循环:追问 → 反思 → 补充 → 收敛&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>&lt;strong>Loop 是 Agent 与&amp;quot;普通 LLM 应用&amp;quot;的分水岭&lt;/strong>。没有循环的 LLM 只是个文本生成器;有了循环,它才有机会成为&amp;quot;会思考、会试错、会自我修正&amp;quot;的智能体。&lt;/p>
&lt;p>&lt;img src="https://www.zata.cc/p/ai-agent-loop-%E5%B7%A5%E7%A8%8B%E5%8E%9F%E7%90%86%E6%A8%A1%E5%BC%8F%E4%B8%8E%E5%AE%9E%E7%8E%B0/images/agent-loop-overview.svg"
loading="lazy"
alt="Agent Loop 总览:一次完整的智能体循环由 5 个阶段构成"
>&lt;/p>
&lt;hr>
&lt;h2 id="1-agent-loop-的解剖">1. Agent Loop 的解剖
&lt;/h2>&lt;p>无论你用 ReAct、Reflection 还是 LangGraph,一个标准的 Agent Loop 都由 5 个固定角色组成:&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>角色&lt;/th>
&lt;th>作用&lt;/th>
&lt;th>工程实现&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>State(状态)&lt;/strong>&lt;/td>
&lt;td>当前任务的所有上下文:目标、历史、记忆、工具结果&lt;/td>
&lt;td>一个 &lt;code>TypedDict&lt;/code> / Pydantic 模型&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Policy(策略)&lt;/strong>&lt;/td>
&lt;td>决定下一步该做什么&lt;/td>
&lt;td>通常是一次 LLM 调用 + 结构化输出&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Action(动作)&lt;/strong>&lt;/td>
&lt;td>执行策略选定的动作&lt;/td>
&lt;td>工具调用 / 子 Agent 调用 / 写文件&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Observer(观察)&lt;/strong>&lt;/td>
&lt;td>把动作结果回填到状态&lt;/td>
&lt;td>&lt;code>tool_result&lt;/code> 追加到 messages&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Terminator(终止器)&lt;/strong>&lt;/td>
&lt;td>判断循环是否该结束&lt;/td>
&lt;td>显式 finish / 步数上限 / 置信度阈值&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>把这 5 个角色串起来,就是最经典的循环骨架:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="n">state&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">init_state&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">user_goal&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="k">for&lt;/span> &lt;span class="n">step&lt;/span> &lt;span class="ow">in&lt;/span> &lt;span class="nb">range&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">max_steps&lt;/span>&lt;span class="p">):&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">decision&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">policy&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">state&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="c1"># LLM 决定下一步&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">if&lt;/span> &lt;span class="n">terminator&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">decision&lt;/span>&lt;span class="p">):&lt;/span> &lt;span class="c1"># 是否结束?&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">return&lt;/span> &lt;span class="n">final_answer&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">decision&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">observation&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">action&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">decision&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="c1"># 调工具 / 子 Agent&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">state&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">update&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">state&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">observation&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="c1"># 更新状态&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="k">return&lt;/span> &lt;span class="n">forced_finish&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">state&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="c1"># 兜底:步数耗尽&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>后续所有&amp;quot;花式&amp;quot;Loop(ReAct、Reflection、Reflexion、Plan-and-Execute)都只是&lt;strong>对这 5 个角色的不同实现与重组&lt;/strong>。&lt;/p>
&lt;hr>
&lt;h2 id="2-经典-loop-模式">2. 经典 Loop 模式
&lt;/h2>&lt;h3 id="21-reactreason--act推理与行动交替">2.1 ReAct:Reason + Act(推理与行动交替)
&lt;/h3>&lt;p>&lt;strong>ReAct&lt;/strong>(Yao et al., 2022)是最广为人知的 Agent Loop 范式,它强制让 LLM 在每一步都按 &amp;ldquo;Thought → Action → Observation&amp;rdquo; 的格式输出:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">Thought 1: 我需要先查一下用户的订单状态
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">Action 1: get_order(order_id=&amp;#34;12345&amp;#34;)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">Observation 1: {&amp;#34;status&amp;#34;: &amp;#34;shipped&amp;#34;, &amp;#34;tracking&amp;#34;: &amp;#34;SF123...&amp;#34;}
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">Thought 2: 订单已发货,接下来需要根据物流信息估算送达时间
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">Action 2: get_eta(tracking=&amp;#34;SF123...&amp;#34;)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">Observation 2: {&amp;#34;eta&amp;#34;: &amp;#34;2026-06-28&amp;#34;}
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">Thought 3: 信息齐全,可以回答用户了
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">Action 3: finish(answer=&amp;#34;您的订单预计 6 月 28 日送达&amp;#34;)
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;strong>Loop 视角的关键点&lt;/strong>:ReAct 的循环不是&amp;quot;调 LLM 一次&amp;quot;,而是**&amp;ldquo;调 LLM → 解析动作 → 执行 → 把结果塞回 prompt → 再调 LLM&amp;rdquo;**。LLM 本身是无状态的,Loop 才是它&amp;quot;持续思考&amp;quot;的载体。&lt;/p>
&lt;p>&lt;img src="https://www.zata.cc/p/ai-agent-loop-%E5%B7%A5%E7%A8%8B%E5%8E%9F%E7%90%86%E6%A8%A1%E5%BC%8F%E4%B8%8E%E5%AE%9E%E7%8E%B0/images/react-loop.svg"
loading="lazy"
alt="ReAct Loop 的运行机制:Thought / Action / Observation 持续循环,直到 finish"
>&lt;/p>
&lt;p>&lt;strong>最小可运行的 ReAct Loop&lt;/strong>:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="kn">import&lt;/span> &lt;span class="nn">json&lt;/span>&lt;span class="o">,&lt;/span> &lt;span class="nn">re&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="kn">from&lt;/span> &lt;span class="nn">openai&lt;/span> &lt;span class="kn">import&lt;/span> &lt;span class="n">OpenAI&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">client&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">OpenAI&lt;/span>&lt;span class="p">()&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">TOOLS&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="s2">&amp;#34;get_weather&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="k">lambda&lt;/span> &lt;span class="n">city&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="sa">f&lt;/span>&lt;span class="s2">&amp;#34;&lt;/span>&lt;span class="si">{&lt;/span>&lt;span class="n">city&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2"> 今天晴, 25°C&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="s2">&amp;#34;get_time&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="k">lambda&lt;/span> &lt;span class="n">_&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="s2">&amp;#34;现在是 2026-06-26 15:00&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="p">}&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">SYSTEM&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;&amp;#34;&amp;#34;你是一个 Agent。每轮必须严格输出:
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="s2">Thought: &amp;lt;你的推理&amp;gt;
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="s2">Action: &amp;lt;JSON,形如 {&amp;#34;name&amp;#34;: &amp;#34;工具名&amp;#34;, &amp;#34;args&amp;#34;: {...}} 或 {&amp;#34;name&amp;#34;: &amp;#34;finish&amp;#34;, &amp;#34;args&amp;#34;: {&amp;#34;answer&amp;#34;: &amp;#34;...&amp;#34;}}&amp;gt;
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="s2">&amp;#34;&amp;#34;&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="k">def&lt;/span> &lt;span class="nf">react_loop&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">user_goal&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">max_steps&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="mi">8&lt;/span>&lt;span class="p">):&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">messages&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="p">[&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="p">{&lt;/span>&lt;span class="s2">&amp;#34;role&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="s2">&amp;#34;system&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s2">&amp;#34;content&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="n">SYSTEM&lt;/span>&lt;span class="p">},&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="p">{&lt;/span>&lt;span class="s2">&amp;#34;role&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="s2">&amp;#34;user&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s2">&amp;#34;content&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="n">user_goal&lt;/span>&lt;span class="p">},&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="p">]&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">for&lt;/span> &lt;span class="n">step&lt;/span> &lt;span class="ow">in&lt;/span> &lt;span class="nb">range&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">max_steps&lt;/span>&lt;span class="p">):&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">resp&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">client&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">chat&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">completions&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">create&lt;/span>&lt;span class="p">(&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">model&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="s2">&amp;#34;gpt-4o-mini&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">messages&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="n">messages&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">text&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">resp&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">choices&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="mi">0&lt;/span>&lt;span class="p">]&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">message&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">content&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">messages&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">append&lt;/span>&lt;span class="p">({&lt;/span>&lt;span class="s2">&amp;#34;role&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="s2">&amp;#34;assistant&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s2">&amp;#34;content&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="n">text&lt;/span>&lt;span class="p">})&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="c1"># 解析 Action&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">m&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">re&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">search&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="sa">r&lt;/span>&lt;span class="s2">&amp;#34;Action:\s*(\{.*\})&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">text&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">re&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">S&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">if&lt;/span> &lt;span class="ow">not&lt;/span> &lt;span class="n">m&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">raise&lt;/span> &lt;span class="ne">ValueError&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="sa">f&lt;/span>&lt;span class="s2">&amp;#34;step &lt;/span>&lt;span class="si">{&lt;/span>&lt;span class="n">step&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2">: 模型未输出 Action&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">action&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">json&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">loads&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">m&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">group&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="mi">1&lt;/span>&lt;span class="p">))&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">if&lt;/span> &lt;span class="n">action&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="s2">&amp;#34;name&amp;#34;&lt;/span>&lt;span class="p">]&lt;/span> &lt;span class="o">==&lt;/span> &lt;span class="s2">&amp;#34;finish&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">return&lt;/span> &lt;span class="n">action&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="s2">&amp;#34;args&amp;#34;&lt;/span>&lt;span class="p">][&lt;/span>&lt;span class="s2">&amp;#34;answer&amp;#34;&lt;/span>&lt;span class="p">],&lt;/span> &lt;span class="n">messages&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="c1"># 执行工具 → Observation&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">obs&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">TOOLS&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="n">action&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="s2">&amp;#34;name&amp;#34;&lt;/span>&lt;span class="p">]](&lt;/span>&lt;span class="o">**&lt;/span>&lt;span class="n">action&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="s2">&amp;#34;args&amp;#34;&lt;/span>&lt;span class="p">])&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">messages&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">append&lt;/span>&lt;span class="p">({&lt;/span>&lt;span class="s2">&amp;#34;role&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="s2">&amp;#34;user&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s2">&amp;#34;content&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="sa">f&lt;/span>&lt;span class="s2">&amp;#34;Observation: &lt;/span>&lt;span class="si">{&lt;/span>&lt;span class="n">obs&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2">&amp;#34;&lt;/span>&lt;span class="p">})&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">raise&lt;/span> &lt;span class="ne">TimeoutError&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;超过最大步数,未收敛&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="22-reflection让模型对自己的输出打分并改写">2.2 Reflection:让模型对自己的输出打分并改写
&lt;/h3>&lt;p>&lt;strong>Reflection&lt;/strong>(Shinn et al., 2023)把 Loop 从&amp;quot;短反馈&amp;quot;升级成&amp;quot;长反馈&amp;quot;:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">[生成阶段] LLM 生成初稿 answer_0
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">[反思阶段] LLM(可以是同一个,也可以是更强的 critic)输出:
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> - score: 0~10
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> - critique: 具体问题清单
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> - improved_answer: 改进版
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">[循环 N 次] 或 直至 score &amp;gt;= threshold
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;strong>Loop 的关键差异&lt;/strong>:ReAct 的循环驱动来自&amp;quot;环境反馈&amp;quot;(工具返回),Reflection 的循环驱动来自&amp;quot;自我反馈&amp;quot;(LLM 评价自己)。&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="k">def&lt;/span> &lt;span class="nf">reflection_loop&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">draft_prompt&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">max_rounds&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="mi">3&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">threshold&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="mi">8&lt;/span>&lt;span class="p">):&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">draft&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">llm&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">draft_prompt&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">history&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="p">[]&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">for&lt;/span> &lt;span class="n">r&lt;/span> &lt;span class="ow">in&lt;/span> &lt;span class="nb">range&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">max_rounds&lt;/span>&lt;span class="p">):&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">critique&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">critic_llm&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="sa">f&lt;/span>&lt;span class="s2">&amp;#34;&amp;#34;&amp;#34;
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="s2"> 请评估以下回答,输出 JSON:
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="s2"> &lt;/span>&lt;span class="se">{{&lt;/span>&lt;span class="s2">&amp;#34;score&amp;#34;: 0-10, &amp;#34;issues&amp;#34;: [...], &amp;#34;improved&amp;#34;: &amp;#34;...&amp;#34;&lt;/span>&lt;span class="se">}}&lt;/span>&lt;span class="s2">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="s2"> 原题: &lt;/span>&lt;span class="si">{&lt;/span>&lt;span class="n">draft_prompt&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="s2"> 当前回答: &lt;/span>&lt;span class="si">{&lt;/span>&lt;span class="n">draft&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="s2"> &amp;#34;&amp;#34;&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">score&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">improved&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">parse&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">critique&lt;/span>&lt;span class="p">)[&lt;/span>&lt;span class="s2">&amp;#34;score&amp;#34;&lt;/span>&lt;span class="p">],&lt;/span> &lt;span class="n">parse&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">critique&lt;/span>&lt;span class="p">)[&lt;/span>&lt;span class="s2">&amp;#34;improved&amp;#34;&lt;/span>&lt;span class="p">]&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">history&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">append&lt;/span>&lt;span class="p">({&lt;/span>&lt;span class="s2">&amp;#34;round&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="n">r&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s2">&amp;#34;score&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="n">score&lt;/span>&lt;span class="p">})&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">if&lt;/span> &lt;span class="n">score&lt;/span> &lt;span class="o">&amp;gt;=&lt;/span> &lt;span class="n">threshold&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">return&lt;/span> &lt;span class="n">improved&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">history&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">draft&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">improved&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">return&lt;/span> &lt;span class="n">draft&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">history&lt;/span> &lt;span class="c1"># 兜底:返回最后一轮&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;img src="https://www.zata.cc/p/ai-agent-loop-%E5%B7%A5%E7%A8%8B%E5%8E%9F%E7%90%86%E6%A8%A1%E5%BC%8F%E4%B8%8E%E5%AE%9E%E7%8E%B0/images/reflection-loop.svg"
loading="lazy"
alt="Reflection Loop:生成 → 反思 → 重写,直到质量达标或达到上限"
>&lt;/p>
&lt;h3 id="23-reflexion把反思结果沉淀到长期记忆">2.3 Reflexion:把反思结果&amp;quot;沉淀&amp;quot;到长期记忆
&lt;/h3>&lt;p>Reflexion 是 Reflection 的进化:它不只是&amp;quot;当场改&amp;quot;,还会把反思结果&lt;strong>写入长期记忆&lt;/strong>,让下次遇到类似问题时少走弯路:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">Trajectory → Reflector → Self-Critique
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> ↓
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> Memory Store (vector db)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> ↓
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> 下次同类任务 ← 检索相关反思 ←┘
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>工程上常见的实现:用一个独立的 &lt;code>MemoryStore&lt;/code> 对象,每次反射后写入 &lt;code>{situation, lesson, score_delta}&lt;/code>,在新任务的 system prompt 里通过 RAG 检索 top-k 相关反思塞进去。&lt;/p>
&lt;h3 id="24-plan-and-execute先想清楚再动手">2.4 Plan-and-Execute:先想清楚,再动手
&lt;/h3>&lt;p>Plan-and-Execute 把 Loop 拆成&lt;strong>两个嵌套循环&lt;/strong>:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">外层 Planner Loop: 任务 → 计划(步骤列表)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">内层 Executor Loop: 对计划中每个 step 反复 ReAct,直至完成
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">外层 Replanner: 若某个 step 失败,回到 Planner 重新规划
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>适合&lt;strong>长流程、阶段性强&lt;/strong>的任务(比如&amp;quot;调研 → 写作 → 校对 → 发布&amp;quot;),能让 Plan 阶段用更强的模型(慢但准),Execute 阶段用更便宜的模型(快且糙)。&lt;/p>
&lt;h3 id="25-camel多智能体角色扮演循环">2.5 CAMEL:多智能体角色扮演循环
&lt;/h3>&lt;p>&lt;strong>CAMEL&lt;/strong>(Communicative Agents for &amp;ldquo;Mind&amp;rdquo; Exploration of LLMs)用两个 Agent(User Proxy + Assistant)在 Loop 中互相对话,中间插入&amp;quot;Inception Prompt&amp;quot;防止角色漂移:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">User Proxy ──► Assistant ──► User Proxy ──► Assistant ...
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> ↑ │
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> └──────────── Critic/Inception Prompt ←──────────────┘
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>适合&lt;strong>对话式博弈、谈判模拟、教学场景&lt;/strong>等&amp;quot;两个角色反复讨论&amp;quot;的场景。&lt;/p>
&lt;hr>
&lt;h2 id="3-loop-的-4-大工程问题">3. Loop 的 4 大工程问题
&lt;/h2>&lt;p>把 Loop 从论文搬到生产,真正难的是下面 4 个问题。&lt;/p>
&lt;h3 id="31-状态管理loop-的内存">3.1 状态管理:Loop 的&amp;quot;内存&amp;quot;
&lt;/h3>&lt;p>&lt;strong>State 是 Loop 工程的命门&lt;/strong>。常见的 State 设计:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="kn">from&lt;/span> &lt;span class="nn">typing&lt;/span> &lt;span class="kn">import&lt;/span> &lt;span class="n">TypedDict&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">Annotated&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">List&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="kn">from&lt;/span> &lt;span class="nn">langgraph.graph.message&lt;/span> &lt;span class="kn">import&lt;/span> &lt;span class="n">add_messages&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="k">class&lt;/span> &lt;span class="nc">AgentState&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">TypedDict&lt;/span>&lt;span class="p">):&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="c1"># 对话历史(自带 reducer,自动追加)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">messages&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="n">Annotated&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="n">List&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="nb">dict&lt;/span>&lt;span class="p">],&lt;/span> &lt;span class="n">add_messages&lt;/span>&lt;span class="p">]&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="c1"># 当前计划(Plan-and-Execute 用)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">plan&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="n">List&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="nb">str&lt;/span>&lt;span class="p">]&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="c1"># 已执行的步骤与结果&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">past_steps&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="n">List&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="nb">tuple&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="nb">str&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="nb">str&lt;/span>&lt;span class="p">]]&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="c1"># 反思/记忆&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">reflections&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="n">List&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="nb">str&lt;/span>&lt;span class="p">]&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="c1"># 控制字段&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">step_count&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="nb">int&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">is_finished&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="nb">bool&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;strong>设计原则&lt;/strong>:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>State 必须可序列化&lt;/strong>:能 &lt;code>pickle&lt;/code> / 写 Redis,这是断点续跑、调试回放的前提。&lt;/li>
&lt;li>&lt;strong>State 必须有 schema&lt;/strong>:用 TypedDict 或 Pydantic,避免字段拼写错误。&lt;/li>
&lt;li>&lt;strong>State 的写入要走 reducer&lt;/strong>:特别是 &lt;code>messages&lt;/code>,不能简单覆盖,而要 append,否则 Loop 会&amp;quot;失忆&amp;quot;。&lt;/li>
&lt;/ul>
&lt;h3 id="32-终止条件loop-不能停不下来">3.2 终止条件:Loop 不能&amp;quot;停不下来&amp;quot;
&lt;/h3>&lt;p>没有终止条件的 Loop 就是个无底洞。&lt;strong>生产中至少要有 3 重保险&lt;/strong>:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="k">def&lt;/span> &lt;span class="nf">should_continue&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">state&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="o">-&amp;gt;&lt;/span> &lt;span class="n">Literal&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="s2">&amp;#34;continue&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s2">&amp;#34;end&amp;#34;&lt;/span>&lt;span class="p">]:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="c1"># 1. 显式 finish(由 LLM 主动调用 finish 工具)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">if&lt;/span> &lt;span class="n">state&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="s2">&amp;#34;is_finished&amp;#34;&lt;/span>&lt;span class="p">]:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">return&lt;/span> &lt;span class="s2">&amp;#34;end&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="c1"># 2. 步数硬上限(防止 token 爆炸)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">if&lt;/span> &lt;span class="n">state&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="s2">&amp;#34;step_count&amp;#34;&lt;/span>&lt;span class="p">]&lt;/span> &lt;span class="o">&amp;gt;=&lt;/span> &lt;span class="mi">20&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">return&lt;/span> &lt;span class="s2">&amp;#34;end&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="c1"># 3. 步数软上限 + 强制收敛(兜底,走 LLM 总结)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">if&lt;/span> &lt;span class="n">state&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="s2">&amp;#34;step_count&amp;#34;&lt;/span>&lt;span class="p">]&lt;/span> &lt;span class="o">&amp;gt;=&lt;/span> &lt;span class="mi">15&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">return&lt;/span> &lt;span class="s2">&amp;#34;force_summarize&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="c1"># 4. 重复检测(防止模型在两个动作间反复横跳)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">if&lt;/span> &lt;span class="n">state&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="s2">&amp;#34;messages&amp;#34;&lt;/span>&lt;span class="p">][&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="mi">1&lt;/span>&lt;span class="p">]&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">content&lt;/span> &lt;span class="o">==&lt;/span> &lt;span class="n">state&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="s2">&amp;#34;messages&amp;#34;&lt;/span>&lt;span class="p">][&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="mi">3&lt;/span>&lt;span class="p">]&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">content&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">return&lt;/span> &lt;span class="s2">&amp;#34;end&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">return&lt;/span> &lt;span class="s2">&amp;#34;continue&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;strong>真实生产事故&lt;/strong>经常源于:忘了设上限,或者上限设了但没接 billing 报警,一夜烧掉几千美元 Token。&lt;/p>
&lt;h3 id="33-记忆机制loop-跨轮次活下来">3.3 记忆机制:Loop 跨轮次&amp;quot;活下来&amp;quot;
&lt;/h3>&lt;p>Loop 内部的 state 是&lt;strong>短期记忆&lt;/strong>,跨会话需要&lt;strong>长期记忆&lt;/strong>:&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>类型&lt;/th>
&lt;th>存储&lt;/th>
&lt;th>写入时机&lt;/th>
&lt;th>检索时机&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>短期(Scratchpad)&lt;/td>
&lt;td>In-memory state&lt;/td>
&lt;td>每一步&lt;/td>
&lt;td>每一步&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>程序性(Procedural)&lt;/td>
&lt;td>系统 prompt&lt;/td>
&lt;td>启动时&lt;/td>
&lt;td>启动时&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>语义性(Semantic)&lt;/td>
&lt;td>Vector DB&lt;/td>
&lt;td>反思后/总结后&lt;/td>
&lt;td>新任务开始&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>情节性(Episodic)&lt;/td>
&lt;td>时序 DB&lt;/td>
&lt;td>任务结束时&lt;/td>
&lt;td>反思阶段&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="k">class&lt;/span> &lt;span class="nc">LongTermMemory&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">def&lt;/span> &lt;span class="fm">__init__&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="bp">self&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">vector_store&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">embedder&lt;/span>&lt;span class="p">):&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="bp">self&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">vs&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="bp">self&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">emb&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">vector_store&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">embedder&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">def&lt;/span> &lt;span class="nf">write&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="bp">self&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">situation&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="nb">str&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">lesson&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="nb">str&lt;/span>&lt;span class="p">):&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="bp">self&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">vs&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">add&lt;/span>&lt;span class="p">(&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">text&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="sa">f&lt;/span>&lt;span class="s2">&amp;#34;情境: &lt;/span>&lt;span class="si">{&lt;/span>&lt;span class="n">situation&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="se">\n&lt;/span>&lt;span class="s2">教训: &lt;/span>&lt;span class="si">{&lt;/span>&lt;span class="n">lesson&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2">&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">embedding&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="bp">self&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">emb&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">embed&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">situation&lt;/span>&lt;span class="p">),&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">def&lt;/span> &lt;span class="nf">recall&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="bp">self&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">current_situation&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="nb">str&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">k&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="mi">3&lt;/span>&lt;span class="p">):&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">q&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="bp">self&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">emb&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">embed&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">current_situation&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">return&lt;/span> &lt;span class="bp">self&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">vs&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">search&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">q&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">top_k&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="n">k&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="34-工具调用loop-与现实世界的接口">3.4 工具调用:Loop 与现实世界的接口
&lt;/h3>&lt;p>Loop 里 80% 的 Action 都是&amp;quot;调工具&amp;quot;。工具设计要点:&lt;/p>
&lt;ol>
&lt;li>&lt;strong>工具描述即 Prompt&lt;/strong>——LLM 选错工具,90% 是工具描述写得烂。&lt;/li>
&lt;li>&lt;strong>工具要返回结构化数据&lt;/strong>(JSON)而非自然语言,便于程序解析。&lt;/li>
&lt;li>&lt;strong>工具有超时和重试&lt;/strong>,避免一个慢工具把整个 Loop 拖死。&lt;/li>
&lt;li>&lt;strong>危险操作(写库、发邮件、删文件)走二次确认分支&lt;/strong>,而不是直接执行。&lt;/li>
&lt;/ol>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="nd">@tool&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="k">def&lt;/span> &lt;span class="nf">send_email&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">to&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="nb">str&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">subject&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="nb">str&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">body&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="nb">str&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="o">-&amp;gt;&lt;/span> &lt;span class="nb">str&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="s2">&amp;#34;&amp;#34;&amp;#34;发送邮件(危险操作,需要 confirm=true 才真正发送)。&amp;#34;&amp;#34;&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">if&lt;/span> &lt;span class="ow">not&lt;/span> &lt;span class="n">confirm&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">return&lt;/span> &lt;span class="sa">f&lt;/span>&lt;span class="s2">&amp;#34;[DRY-RUN] 将发送邮件给 &lt;/span>&lt;span class="si">{&lt;/span>&lt;span class="n">to&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2">,主题: &lt;/span>&lt;span class="si">{&lt;/span>&lt;span class="n">subject&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2">&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">return&lt;/span> &lt;span class="n">_smtp_send&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">to&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">subject&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">body&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;hr>
&lt;h2 id="4-用-langgraph-把-loop-工程化">4. 用 LangGraph 把 Loop 工程化
&lt;/h2>&lt;p>手写 ReAct Loop 在 demo 阶段可以,但生产中你需要:可视化、断点续跑、人介入、可观测——这些 LangGraph 已经帮你做好。&lt;/p>
&lt;h3 id="41-最小的-langgraph-agent-loop">4.1 最小的 LangGraph Agent Loop
&lt;/h3>&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="kn">from&lt;/span> &lt;span class="nn">typing&lt;/span> &lt;span class="kn">import&lt;/span> &lt;span class="n">Literal&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">Annotated&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">List&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="kn">from&lt;/span> &lt;span class="nn">typing_extensions&lt;/span> &lt;span class="kn">import&lt;/span> &lt;span class="n">TypedDict&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="kn">from&lt;/span> &lt;span class="nn">langgraph.graph&lt;/span> &lt;span class="kn">import&lt;/span> &lt;span class="n">StateGraph&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">START&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">END&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="kn">from&lt;/span> &lt;span class="nn">langgraph.graph.message&lt;/span> &lt;span class="kn">import&lt;/span> &lt;span class="n">add_messages&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="kn">from&lt;/span> &lt;span class="nn">langchain_openai&lt;/span> &lt;span class="kn">import&lt;/span> &lt;span class="n">ChatOpenAI&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="kn">from&lt;/span> &lt;span class="nn">langchain_core.tools&lt;/span> &lt;span class="kn">import&lt;/span> &lt;span class="n">tool&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="kn">from&lt;/span> &lt;span class="nn">langgraph.prebuilt&lt;/span> &lt;span class="kn">import&lt;/span> &lt;span class="n">ToolNode&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nd">@tool&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="k">def&lt;/span> &lt;span class="nf">search&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">query&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="nb">str&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="o">-&amp;gt;&lt;/span> &lt;span class="nb">str&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="s2">&amp;#34;&amp;#34;&amp;#34;模拟搜索工具。&amp;#34;&amp;#34;&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">return&lt;/span> &lt;span class="sa">f&lt;/span>&lt;span class="s2">&amp;#34;关于 &amp;#39;&lt;/span>&lt;span class="si">{&lt;/span>&lt;span class="n">query&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2">&amp;#39; 的搜索结果: ...&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">tools&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="p">[&lt;/span>&lt;span class="n">search&lt;/span>&lt;span class="p">]&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">llm&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">ChatOpenAI&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">model&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="s2">&amp;#34;gpt-4o-mini&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">bind_tools&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">tools&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="k">class&lt;/span> &lt;span class="nc">State&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">TypedDict&lt;/span>&lt;span class="p">):&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">messages&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="n">Annotated&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="n">List&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">add_messages&lt;/span>&lt;span class="p">]&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">step_count&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="nb">int&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="k">def&lt;/span> &lt;span class="nf">agent&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">state&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="n">State&lt;/span>&lt;span class="p">):&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">return&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="s2">&amp;#34;messages&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="p">[&lt;/span>&lt;span class="n">llm&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">invoke&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">state&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="s2">&amp;#34;messages&amp;#34;&lt;/span>&lt;span class="p">])],&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="s2">&amp;#34;step_count&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="n">state&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">get&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;step_count&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="mi">0&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="o">+&lt;/span> &lt;span class="mi">1&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="p">}&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="k">def&lt;/span> &lt;span class="nf">should_continue&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">state&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="n">State&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="o">-&amp;gt;&lt;/span> &lt;span class="n">Literal&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="s2">&amp;#34;tools&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">END&lt;/span>&lt;span class="p">]:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">last&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">state&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="s2">&amp;#34;messages&amp;#34;&lt;/span>&lt;span class="p">][&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="mi">1&lt;/span>&lt;span class="p">]&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">if&lt;/span> &lt;span class="n">last&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">tool_calls&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">return&lt;/span> &lt;span class="s2">&amp;#34;tools&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">return&lt;/span> &lt;span class="n">END&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">graph&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="p">(&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">StateGraph&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">State&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="o">.&lt;/span>&lt;span class="n">add_node&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;agent&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">agent&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="o">.&lt;/span>&lt;span class="n">add_node&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;tools&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">ToolNode&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">tools&lt;/span>&lt;span class="p">))&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="o">.&lt;/span>&lt;span class="n">add_edge&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">START&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s2">&amp;#34;agent&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="o">.&lt;/span>&lt;span class="n">add_conditional_edges&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;agent&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">should_continue&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="o">.&lt;/span>&lt;span class="n">add_edge&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;tools&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s2">&amp;#34;agent&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="o">.&lt;/span>&lt;span class="n">compile&lt;/span>&lt;span class="p">()&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="k">for&lt;/span> &lt;span class="n">chunk&lt;/span> &lt;span class="ow">in&lt;/span> &lt;span class="n">graph&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">stream&lt;/span>&lt;span class="p">({&lt;/span>&lt;span class="s2">&amp;#34;messages&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="p">[(&lt;/span>&lt;span class="s2">&amp;#34;user&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s2">&amp;#34;查一下 LangGraph 最新版本&amp;#34;&lt;/span>&lt;span class="p">)]}):&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="nb">print&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">chunk&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="42-加-reflection-节点">4.2 加 Reflection 节点
&lt;/h3>&lt;p>在上面的图里多加一个 &lt;code>reflect&lt;/code> 节点和一条边:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="k">def&lt;/span> &lt;span class="nf">reflect&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">state&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="n">State&lt;/span>&lt;span class="p">):&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">last_answer&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">state&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="s2">&amp;#34;messages&amp;#34;&lt;/span>&lt;span class="p">][&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="mi">1&lt;/span>&lt;span class="p">]&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">content&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">critique&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">llm&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">invoke&lt;/span>&lt;span class="p">([&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="p">{&lt;/span>&lt;span class="s2">&amp;#34;role&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="s2">&amp;#34;system&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s2">&amp;#34;content&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="s2">&amp;#34;你是质检员,打分 0-10,并指出问题。&amp;#34;&lt;/span>&lt;span class="p">},&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="p">{&lt;/span>&lt;span class="s2">&amp;#34;role&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="s2">&amp;#34;user&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s2">&amp;#34;content&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="sa">f&lt;/span>&lt;span class="s2">&amp;#34;评价: &lt;/span>&lt;span class="si">{&lt;/span>&lt;span class="n">last_answer&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2">&amp;#34;&lt;/span>&lt;span class="p">},&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="p">])&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">score&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">parse_score&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">critique&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">content&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">if&lt;/span> &lt;span class="n">score&lt;/span> &lt;span class="o">&amp;gt;=&lt;/span> &lt;span class="mi">8&lt;/span> &lt;span class="ow">or&lt;/span> &lt;span class="n">state&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="s2">&amp;#34;step_count&amp;#34;&lt;/span>&lt;span class="p">]&lt;/span> &lt;span class="o">&amp;gt;=&lt;/span> &lt;span class="mi">10&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">return&lt;/span> &lt;span class="n">END&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">return&lt;/span> &lt;span class="s2">&amp;#34;agent&amp;#34;&lt;/span> &lt;span class="c1"># 不满意,把 critique 塞回 messages 让 agent 重写&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>完整图就是:&lt;strong>agent → tools → agent → &amp;hellip; → reflect → (END | agent)&lt;/strong>。&lt;/p>
&lt;p>&lt;img src="https://www.zata.cc/p/ai-agent-loop-%E5%B7%A5%E7%A8%8B%E5%8E%9F%E7%90%86%E6%A8%A1%E5%BC%8F%E4%B8%8E%E5%AE%9E%E7%8E%B0/images/langgraph-loop.svg"
loading="lazy"
alt="LangGraph 实现 Agent Loop:节点、边、终止条件"
>&lt;/p>
&lt;h3 id="43-加人介入human-in-the-loop">4.3 加人介入(Human-in-the-Loop)
&lt;/h3>&lt;p>生产中,任何高风险分支都该有&amp;quot;暂停等人确认&amp;quot;的节点:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="n">graph&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">add_node&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;human_review&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="k">lambda&lt;/span> &lt;span class="n">s&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="n">s&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="c1"># 实际是中断点&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">graph&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">add_edge&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;agent&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s2">&amp;#34;human_review&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">condition&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="k">lambda&lt;/span> &lt;span class="n">s&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="s2">&amp;#34;send_email&amp;#34;&lt;/span> &lt;span class="ow">in&lt;/span> &lt;span class="nb">str&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">s&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="s2">&amp;#34;messages&amp;#34;&lt;/span>&lt;span class="p">][&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="mi">1&lt;/span>&lt;span class="p">]))&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>部署时打开 &lt;code>interrupt_before=[&amp;quot;human_review&amp;quot;]&lt;/code>,LangGraph 会在该节点暂停,把当前 state 持久化,等人通过 API 决策后再 resume。&lt;/p>
&lt;h3 id="44-测试比人介入更重要的一环">4.4 测试:比&amp;quot;人介入&amp;quot;更重要的一环
&lt;/h3>&lt;p>人介入(Human-in-the-Loop)常被当作安全兜底,但它不是目的,而是&lt;strong>过渡手段&lt;/strong>。真正能让 Agent Loop 规模化运转的,是&lt;strong>把&amp;quot;人测&amp;quot;变成&amp;quot;自动测&amp;quot;&lt;/strong>。&lt;/p>
&lt;p>如果测试做得好,Loop 在每一轮迭代后都能自动验证中间产物(代码是否编译、单测是否通过、API 返回是否符合 schema、生成内容是否满足评分标准),那么大量原本需要人工复核的环节就会被自动化覆盖,人工介入才会被压缩到真正的&amp;quot;异常&amp;quot;和&amp;quot;边界&amp;quot;上。反之,如果测试缺位,Loop 跑完后人还是要从头到尾做验收,Agent 带来的效率提升会被人工测试抵消大半。&lt;/p>
&lt;p>所以 Agent Loop 工程里真正值得重点设计的,其实是两个节点:&lt;/p>
&lt;ol>
&lt;li>&lt;strong>需求提出&lt;/strong>:人把目标、验收标准、约束说清楚——这是 Loop 的输入。&lt;/li>
&lt;li>&lt;strong>测试介入&lt;/strong>:用可执行、可自动化的测试把输出验回来——这是 Loop 的收敛判据。&lt;/li>
&lt;/ol>
&lt;p>这个思路其实来源于&lt;strong>芯片设计&lt;/strong>的理念:芯片一旦流片,发现问题就是天价损失;但如果前期验证(仿真、形式验证、原型测试)做得充分,就能把风险挡在量产之前。AI 时代的 Agent 部署也类似——Loop 里多跑几轮测试,多消耗一些 token,边际成本几乎为零;可一旦把有缺陷的产出发布上线,修复成本和业务影响会大得多。&lt;/p>
&lt;p>因此,不要把测试看成&amp;quot;额外的开销&amp;quot;,而要把它当成&lt;strong>用廉价 token 换取上线确定性&lt;/strong>的投资。其余环节(规划、执行、反思)都应该朝着&amp;quot;让需求端到测试端之间的循环尽量少依赖人工&amp;quot;去优化。&lt;/p>
&lt;hr>
&lt;h2 id="5-loop-的可观测性">5. Loop 的可观测性
&lt;/h2>&lt;p>Loop 跑起来后,&lt;strong>你必须能回答这三个问题&lt;/strong>:&lt;/p>
&lt;ol>
&lt;li>&lt;strong>它在每一步想了什么、做了什么?&lt;/strong> → 记录完整的 messages / tool_calls / observations。&lt;/li>
&lt;li>&lt;strong>它为什么没收敛?&lt;/strong> → 终止时把&amp;quot;最后 3 步状态&amp;quot;dump 出来。&lt;/li>
&lt;li>&lt;strong>它花了多少钱/多少时间?&lt;/strong> → 每个 step 单独计费。&lt;/li>
&lt;/ol>
&lt;p>LangGraph 与 LangSmith 集成最省事:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="kn">import&lt;/span> &lt;span class="nn">os&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">os&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">environ&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="s2">&amp;#34;LANGSMITH_TRACING&amp;#34;&lt;/span>&lt;span class="p">]&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;true&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">os&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">environ&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="s2">&amp;#34;LANGSMITH_API_KEY&amp;#34;&lt;/span>&lt;span class="p">]&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;...&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># 之后每次 graph.invoke() 都会自动 trace 到 LangSmith&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>自建可观测栈的话,关键埋点:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="k">with&lt;/span> &lt;span class="n">tracer&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">start_as_current_span&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;agent_loop&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="k">as&lt;/span> &lt;span class="n">span&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">span&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">set_attribute&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;goal&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">user_goal&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">for&lt;/span> &lt;span class="n">step&lt;/span> &lt;span class="ow">in&lt;/span> &lt;span class="nb">range&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">max_steps&lt;/span>&lt;span class="p">):&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">with&lt;/span> &lt;span class="n">tracer&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">start_as_current_span&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="sa">f&lt;/span>&lt;span class="s2">&amp;#34;step_&lt;/span>&lt;span class="si">{&lt;/span>&lt;span class="n">step&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2">&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="k">as&lt;/span> &lt;span class="n">s&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">s&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">set_attribute&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;messages_count&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="nb">len&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">state&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="s2">&amp;#34;messages&amp;#34;&lt;/span>&lt;span class="p">]))&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">s&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">set_attribute&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;tokens_in&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">usage&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">prompt_tokens&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">s&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">set_attribute&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;tokens_out&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">usage&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">completion_tokens&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="o">...&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;hr>
&lt;h2 id="6-loop-的反模式">6. Loop 的反模式
&lt;/h2>&lt;p>写了几年 Agent Loop 后,我总结出&lt;strong>最常踩的几个坑&lt;/strong>:&lt;/p>
&lt;h3 id="61-无限循环--上限过高">6.1 无限循环 / 上限过高
&lt;/h3>&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># 反例&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="k">for&lt;/span> &lt;span class="n">step&lt;/span> &lt;span class="ow">in&lt;/span> &lt;span class="n">itertools&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">count&lt;/span>&lt;span class="p">():&lt;/span> &lt;span class="c1"># 真的无限&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="o">...&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;strong>一定要设硬上限&lt;/strong>,而且上限要和 budget 系统联动。&lt;/p>
&lt;h3 id="62-把所有上下文都塞进-prompt">6.2 把所有上下文都塞进 Prompt
&lt;/h3>&lt;p>Loop 跑 20 步,prompt 里堆 20 轮对话 + 10 个工具结果,Token 直接爆炸。要么用&lt;strong>摘要压缩&lt;/strong>,要么用&lt;strong>外部状态机&lt;/strong>把历史写到外部存储,prompt 里只保留&amp;quot;最近 K 步 + 关键事件&amp;quot;。&lt;/p>
&lt;h3 id="63-没有-fallback-工具调用失败">6.3 没有 fallback 工具调用失败
&lt;/h3>&lt;p>工具超时 / 5xx / 返回错误 JSON 都太常见。&lt;strong>Action 层必须包一层 retry + fallback&lt;/strong>,而不是直接把异常抛给 LLM 让它自己&amp;quot;看着办&amp;quot;。&lt;/p>
&lt;h3 id="64-终止条件依赖-llm-自报">6.4 终止条件依赖 LLM 自报
&lt;/h3>&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># 反例&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">done&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">llm&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;请判断任务是否完成,只回答 yes/no&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="o">==&lt;/span> &lt;span class="s2">&amp;#34;yes&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>LLM 自报&amp;quot;完成&amp;quot;非常不可靠,&lt;strong>用结构化 finish 工具 + 外部校验&lt;/strong>双保险。&lt;/p>
&lt;h3 id="65-跨任务共享同一个-longterm-memory-不做清理">6.5 跨任务共享同一个 LongTerm Memory 不做清理
&lt;/h3>&lt;p>记忆库会越积越杂,质量会越来越差。定期做&lt;strong>记忆蒸馏&lt;/strong>:把多条相似记忆合并成一条;对引用次数为 0 的记忆做 GC。&lt;/p>
&lt;hr>
&lt;h2 id="7-一个生产级-agent-loop-的参考骨架">7. 一个生产级 Agent Loop 的参考骨架
&lt;/h2>&lt;p>综合前文,生产里我推荐的 Agent Loop 骨架长这样:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-text" data-lang="text">&lt;span class="line">&lt;span class="cl">┌─────────────────────────────────────────────────┐
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">│ 主循环 (Agent Loop) │
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">│ │
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">│ ┌────────┐ ┌────────┐ ┌─────────────┐ │
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">│ │ Planner│───►│Executor│───►│ Reflector │ │
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">│ └────────┘ └────┬───┘ └──────┬──────┘ │
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">│ ▲ │ │ │
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">│ │ ▼ ▼ │
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">│ │ ┌────────┐ ┌─────────────┐ │
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">│ └────────│Replanner│ │Memory Writer│ │
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">│ └────────┘ └─────────────┘ │
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">│ │
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">│ 三重保险:Step 上限 / Token 上限 / 重复检测 │
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">│ 三层记忆:State / Vector DB / External Store │
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">└─────────────────────────────────────────────────┘
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>对应到 LangGraph,就是带分支的图 + 多个子节点 + 持久化 checkpointer。&lt;strong>所有&amp;quot;花式&amp;quot;Agent 框架的差异,本质上都是这个骨架的不同拓扑&lt;/strong>。&lt;/p>
&lt;hr>
&lt;h2 id="8-总结">8. 总结
&lt;/h2>&lt;ul>
&lt;li>&lt;strong>Agent Loop = 状态 + 策略 + 动作 + 观察 + 终止器&lt;/strong> 的循环。&lt;/li>
&lt;li>经典模式:&lt;strong>ReAct&lt;/strong>(环境反馈)、&lt;strong>Reflection&lt;/strong>(自我反馈)、&lt;strong>Reflexion&lt;/strong>(反馈入长期记忆)、&lt;strong>Plan-and-Execute&lt;/strong>(嵌套循环)、&lt;strong>CAMEL&lt;/strong>(双角色循环)。&lt;/li>
&lt;li>工程化的 4 大难题:&lt;strong>状态管理、终止条件、记忆机制、工具设计&lt;/strong>。&lt;/li>
&lt;li>落地推荐 &lt;strong>LangGraph&lt;/strong>:它把 Loop 的节点、边、终止、持久化、人介入都做成了原生能力。&lt;/li>
&lt;li>&lt;strong>可观测性和硬上限是 Loop 工程的生死线&lt;/strong>——没有它们,Agent 就是个会烧钱、会失控的黑盒。&lt;/li>
&lt;/ul>
&lt;p>掌握了 Agent Loop,你就不再是&amp;quot;在调 LLM&amp;quot;,而是在&lt;strong>设计一个由 LLM 驱动的分布式系统&lt;/strong>——这正是&amp;quot;智能体编排设计工程师&amp;quot;的核心能力。&lt;/p>
&lt;hr>
&lt;h2 id="参考资料">参考资料
&lt;/h2>&lt;ul>
&lt;li>Yao et al., &lt;strong>ReAct: Synergizing Reasoning and Acting in Language Models&lt;/strong>, 2022.&lt;/li>
&lt;li>Shinn et al., &lt;strong>Reflexion: Language Agents with Verbal Reinforcement Learning&lt;/strong>, 2023.&lt;/li>
&lt;li>Wei et al., &lt;strong>Chain-of-Thought Prompting Elicits Reasoning in Large Language Models&lt;/strong>, 2022.&lt;/li>
&lt;li>LangGraph Documentation, &lt;a class="link" href="https://langchain-ai.github.io/langgraph/" target="_blank" rel="noopener"
>https://langchain-ai.github.io/langgraph/&lt;/a>&lt;/li>
&lt;li>《&lt;a class="link" href="https://www.zata.cc/p/%e6%99%ba%e8%83%bd%e4%bd%93%e7%bc%96%e6%8e%92%e8%ae%be%e8%ae%a1%e5%b7%a5%e7%a8%8b%e5%b8%88%e5%ad%a6%e4%b9%a0%e6%8c%87%e5%8d%97/" >智能体编排设计工程师学习指南&lt;/a>》(本系列前篇)&lt;/li>
&lt;/ul></description></item></channel></rss>