主题归档 2026-05-12 ★★★★★ Hermes Agent Agent Architecture Source Code Gateway Tool Calling MCP

#Hermes Agent 源码解读：从入口、主循环到 Gateway 的完整架构

这篇文章基于 /usr/local/lib/hermes-agent 的本地源码阅读，不把 Hermes 当成一个“会聊天的 CLI”来讲，而是把它当成一个 Agent 运行时系统来拆：它有多个入口、多种模型协议、多套工具面、多平台 Gateway、长期会话存储、后台任务、插件和安全边界。

一句话结论：Hermes Agent 的核心是 run_agent.py:AIAgent 这条共享 Agent loop；CLI、Gateway、Cron、TUI、ACP、API server、插件平台最终都在不同程度上把输入整理成消息、配置、工具集合和会话状态，然后交给同一个主循环执行。

人话版心智模型是：Hermes 像一个 Agent 操作系统。CLI 是本地终端入口，Gateway 是消息平台入口，Provider/Transport 是模型网卡，Tool Registry 是设备驱动表，SessionDB 是文件系统，Memory/Skills 是长期知识，Cron/Webhook 是后台唤醒器，Plugin/MCP 是扩展总线。

技术名词版是：Hermes 采用 registry-driven extension、provider profile + transport abstraction、toolset-based capability gating、session-scoped prompt snapshot、SQLite-backed session persistence、platform adapter abstraction、Gateway active-session state machine、lossy context compression，以及 contextvars-backed approval boundary。

#1. 总体架构图

用户入口
  |
  |-- CLI: hermes_cli/main.py -> cli.py:HermesCLI
  |-- Gateway: gateway/platforms/* -> gateway/run.py:GatewayRunner
  |-- Cron/Webhook: cron/scheduler.py / gateway/platforms/webhook.py
  |-- TUI/ACP/API: tui_gateway/server.py / acp_adapter/server.py / gateway adapter
  |
  v
运行时配置解析
  hermes_cli/config.py
  hermes_cli/runtime_provider.py
  gateway/config.py
  agent/credential_pool.py
  |
  v
Agent 核心
  run_agent.py:AIAgent
    - 构造稳定 system prompt
    - 拼 API messages
    - 调用 ProviderTransport
    - 处理 streaming / retry / fallback
    - 执行 tool calls
    - 触发 compression / memory / plugin hooks
    - 写入 SessionDB
  |
  +--> 模型层
  |      providers/base.py:ProviderProfile
  |      providers/__init__.py
  |      agent/transports/*
  |
  +--> 工具层
  |      tools/registry.py:ToolRegistry
  |      toolsets.py
  |      model_tools.py
  |      tools/*.py / tools/mcp_tool.py / plugins
  |
  +--> 上下文层
  |      agent/prompt_builder.py
  |      agent/memory_manager.py
  |      tools/memory_tool.py
  |      tools/skills_tool.py
  |      agent/context_compressor.py
  |
  v
状态与投递
  hermes_state.py:SessionDB
  gateway/session.py:SessionStore
  gateway/delivery.py
  platform adapter.send()

读源码时最重要的边界是：入口层不应该自己实现 Agent 智能；它只负责把平台、配置、用户输入和会话坐标归一化。真正的模型调用、工具调用、状态收尾都在 run_agent.py 周围。

#2. 请求生命周期：一条消息如何跑完

下面这条链路以 Gateway 消息为例，因为它覆盖的系统最多。CLI 比它短，Cron/Webhook/TUI/ACP 是类似变体。

1. 平台收到消息
   gateway/platforms/telegram.py / discord.py / feishu.py / webhook.py
   -> 生成 gateway/platforms/base.py:MessageEvent

2. Gateway 做入口分流
   gateway/run.py:GatewayRunner._handle_message()
   -> plugin pre_gateway_dispatch
   -> 用户授权 / pairing
   -> slash command / approve / yolo / reload-mcp / queue / interrupt
   -> running-agent busy 策略

3. 绑定会话坐标
   gateway/session.py:SessionStore.get_or_create_session()
   -> session_key 映射到 hermes_state.py 的 session_id
   -> build_session_context_prompt() 生成平台上下文

4. 创建或复用 AIAgent
   gateway/run.py:_handle_message_with_agent()
   -> _agent_cache[session_key] 命中则复用
   -> 否则用 runtime_provider 结果创建 run_agent.py:AIAgent

5. AIAgent 开始主循环
   run_agent.py:AIAgent.run_conversation()
   -> 加用户消息
   -> 恢复或构造 session-scoped system prompt
   -> preflight context compression
   -> memory prefetch / plugin pre_llm_call 注入到 API copy

6. 调用模型
   run_agent.py:_get_transport()
   -> agent/transports/chat_completions.py 或 codex.py 等
   -> providers/base.py:ProviderProfile 提供 provider quirks

7. 执行工具
   模型返回 tool_calls
   -> run_agent.py 校验工具名和 JSON 参数
   -> model_tools.py:handle_function_call()
   -> tools/registry.py:ToolRegistry.dispatch()
   -> tools/*.py 具体 handler
   -> 工具结果作为 role=tool 回到 messages

8. 循环直到最终响应
   -> 模型可继续读工具结果
   -> 可能触发 retry / fallback / continuation / compression
   -> 最终 assistant response

9. 持久化和投递
   -> hermes_state.py:SessionDB.append_message()
   -> token/cost/cache stats 写回 session
   -> Gateway adapter.send() 发回平台

这个生命周期解释了为什么 Hermes 的源码看起来“大”：它不是单轮 client.chat.completions.create() 包装器，而是一个要在中断、重启、长会话、工具失败、provider 降级、多平台投递后仍能继续工作的运行时。

#3. 源码阅读路线

建议按控制流读，不要按目录从上到下读。

想理解什么	先读哪里	再读哪里	读完应该形成的心智模型
CLI 如何启动	`hermes_cli/main.py:cmd_chat()`	`cli.py:HermesCLI.chat()`、`hermes_cli/runtime_provider.py`	CLI 做参数、配置、凭证和交互壳，Agent loop 不在 CLI 里
主循环	`run_agent.py:class AIAgent`	`_build_system_prompt()`、`run_conversation()`、`_execute_tool_calls*()`	一次 turn 是 message loop + API loop + tool loop + persistence
模型接入	`hermes_cli/runtime_provider.py:resolve_runtime_provider()`	`providers/base.py`、`agent/transports/base.py`、`agent/transports/chat_completions.py`	provider 是服务商配置，transport 是协议格式
工具系统	`tools/registry.py`	`toolsets.py`、`model_tools.py`、具体 `tools/*.py`	registry 注册工具，toolset 决定暴露面，model_tools 做 schema 和 dispatch
Skills/Memory	`tools/skills_tool.py`、`tools/memory_tool.py`	`agent/prompt_builder.py`、`agent/memory_manager.py`	Skills 是渐进披露的操作手册，Memory 是稳定事实和外部召回
会话	`hermes_state.py:SessionDB`	`gateway/session.py:SessionStore`	SQLite 管聊天记录，Gateway JSON index 管平台位置到 session_id 的映射
Gateway	`gateway/run.py:GatewayRunner`	`gateway/platforms/base.py`、具体平台 adapter	平台 adapter 只做翻译，GatewayRunner 做授权、命令、busy、agent lifecycle
Cron/Webhook	`cron/jobs.py`、`cron/scheduler.py`	`tools/cronjob_tools.py`、`gateway/platforms/webhook.py`	后台任务最终也复用 AIAgent，除非 no_agent/deliver_only
插件/MCP	`hermes_cli/plugins.py`	`tools/mcp_tool.py`、`gateway/platform_registry.py`	插件进程内注册能力，MCP 外部 server 最终也注册成 Hermes tool
TUI/ACP	`tui_gateway/server.py`	`acp_adapter/server.py`	它们是本地协议适配层，核心仍是 AIAgent
安全	`tools/approval.py`	`tools/terminal_tool.py`、`gateway/run.py`、`hermes_logging.py`	安全边界分布在入口、工具、审批、日志和 sandbox

#4. CLI 与配置流：入口做“启动仪式”，不做核心智能

Hermes 的命令入口主要在 hermes_cli/main.py。cmd_chat() 负责进入交互聊天，cmd_gateway() 负责启动 Gateway。这个文件很大，但核心作用是命令路由和启动前检查。

#4.1 配置在哪里变成运行时参数

hermes_cli/config.py 管 Hermes home、profile、默认配置、用户 config.yaml、.env 和若干兼容字段。关键函数包括：

ensure_hermes_home()：首次启动时创建 Hermes 目录结构。
get_config_path()：定位当前 profile 下的配置文件。
load_config()：合并默认配置和用户配置，并做字段规范化。

人话解释：配置层的职责不是“告诉模型该怎么回答”，而是把用户机器上的模型、工具、凭证、Gateway、安全策略和 profile 统一成一份运行时可消费的字典。

技术名词：configuration normalization + profile isolation。

#4.2 Provider 解析不在 AIAgent 里完成

hermes_cli/runtime_provider.py:resolve_runtime_provider() 是进入模型层前的重要桥。它把 provider、model、base_url、api_key、api_mode、credential pool 和 OAuth token 解析成 Agent 可用的运行时对象。

这一步为什么需要单独模块？因为模型“可调用”不等于配置文件里有一个模型名。源码里能看到很多现实分支：

自定义 endpoint 根据 URL 推断 api_mode。
Codex、Nous、Qwen、Gemini、MiniMax 等 OAuth 凭证需要运行时刷新。
Bedrock 走 AWS credential chain。
OpenCode/Copilot/Azure Foundry 可能要按模型名反推协议。
credential pool 可能替代单个 API key。

CLI 的 cli.py:HermesCLI._ensure_runtime_credentials() 会在每轮对话前重新解析凭证，目的是支持 token refresh、key rotation、/model 切换和 fallback。然后 HermesCLI.chat() 创建或更新 AIAgent，把用户输入交给 run_conversation()。

#4.3 CLI 与 Gateway 的区别

CLI 是一个用户本地、单进程、交互式入口。Gateway 是多平台、长期运行、可并发、有授权和恢复需求的入口。但两者最终都把这些东西交给 AIAgent：

项	CLI	Gateway
用户输入	prompt_toolkit / stdin	platform webhook/polling/event
会话坐标	当前 CLI session	platform + chat + thread + user -> session_key
凭证解析	每轮 `_ensure_runtime_credentials()`	per-session runtime route + overrides
中断	本地键盘/命令	running-agent fast path、interrupt/queue/steer
回复	终端输出	adapter.send()、分片、thread metadata、media
核心执行	`AIAgent.run_conversation()`	同左

#5. `run_agent.py:AIAgent`：共享主循环

run_agent.py 是 Hermes 的核心文件，也是维护风险最大的文件。inventory 显示它接近 790 KB。读它不要试图一次读完，先抓四个位置：

class AIAgent.__init__()：初始化 provider/client/tools/session/memory/compressor/prompt cache/credential pool。
_build_system_prompt()：拼系统提示。
run_conversation()：主循环。
_execute_tool_calls()、_execute_tool_calls_sequential()、_execute_tool_calls_concurrent()：工具执行。

#5.1 系统提示是 session 级快照

run_agent.py:_build_system_prompt() 的注释明确说明：system prompt 在一个 session 内缓存，只有压缩等事件后才重建，目的是最大化 prefix cache 命中。

它按层拼接：

SOUL.md / DEFAULT_AGENT_IDENTITY
  + Hermes help guidance
  + tool-aware guidance
  + caller system_message
  + built-in memory / USER.md
  + external memory provider prompt
  + skills prompt
  + context files: AGENTS.md / .cursorrules 等
  + conversation start timestamp / session_id / model / provider
  + platform hint

其中一个细节很关键：ephemeral_system_prompt 不在 _build_system_prompt() 里拼。源码注释说明它只在 API-call time 注入，不进入缓存和持久化的系统提示。

人话解释：稳定的东西放在 session prompt 快照里，临时的东西只发给这次模型调用。这样既能省 token cache 成本，也不会污染 transcript。

技术名词：session-scoped immutable system prompt snapshot + API-call-time ephemeral injection。

#5.2 主循环不是一个简单 while

run_agent.py:run_conversation() 一次 turn 做的事很多：

准备阶段
  - 安装 safe stdio
  - 确保 SessionDB session
  - 设置日志 session context
  - 恢复 primary runtime
  - 清理 surrogate 字符
  - 重置 retry counters / iteration budget / stream scrubbers
  - 添加用户消息

上下文阶段
  - 恢复或构造 cached system prompt
  - preflight context compression
  - memory manager prefetch
  - plugin pre_llm_call
  - 构造 api_messages copy

模型阶段
  - 加 system message
  - 加 prefill messages
  - prompt cache marker
  - sanitize orphan tool results
  - normalize whitespace / tool-call JSON
  - transport.build_kwargs()
  - streaming 或 non-streaming API call

工具阶段
  - 校验 tool name
  - 修复或拒绝 invalid JSON
  - deduplicate / cap delegate_task
  - append assistant(tool_calls)
  - 执行 tool handlers
  - append role=tool results

收尾阶段
  - final response
  - token/cost/cache stats
  - context compressor usage update
  - persist messages
  - cleanup per-turn resources

主循环里大量代码看起来像“兼容补丁”，例如 tool-call JSON 修复、orphan tool result 清理、thinking-only assistant turn 删除、provider empty response retry、finish_reason length continuation。它们背后的原因是 Hermes 同时支持很多 provider 和协议，模型返回格式并不总是理想。

#5.3 工具调用在主循环中被严格校验

当模型返回 tool calls 后，run_agent.py 不会直接执行。它先做：

工具名是否在 self.valid_tool_names。
能否 auto-repair 常见工具名错误。
参数是否是 JSON。
参数被截断时拒绝执行，避免半截命令或半截 patch 造成破坏。
对 delegate_task 做数量上限和去重。
区分 housekeeping 工具和实质工具，决定是否静默后续输出。

真正分发时，普通工具走 model_tools.py:handle_function_call()，再进 tools/registry.py:ToolRegistry.dispatch()。某些 agent-level 工具或 context compressor 工具会在 run_agent.py 内部分流。

#5.4 `AIAgent` 的优点和代价

优点是所有入口共享同一套行为：CLI、Gateway、Cron、TUI、ACP、子 Agent 都能得到同样的 provider fallback、工具执行、memory、compression 和持久化语义。

代价是 AIAgent 已经是 orchestration hotspot。它同时关心 provider、transport、tool loop、session persistence、prompt cache、compression、plugin hooks、streaming、interrupt、usage accounting。后续如果重构，最值得拆的是：

当前职责	建议边界	原因
API retry/fallback	`agent/api_call_manager.py`	provider 错误处理可独立测试
message sanitize/repair	`agent/message_sanitizer.py`	纯函数多，适合单测
prompt cache policy	`agent/prompt_caching.py` 扩展	现在 policy 在 `run_agent.py`，marker 在 `prompt_caching.py`
token/cost/cache stats	`agent/usage_accounting.py`	减少主循环噪音
tool loop state machine	`agent/tool_loop.py`	AIAgent 只保留调度角色

#6. Provider 与 Transport：服务商和协议不是一回事

Hermes 把“服务商是谁”和“API 协议长什么样”拆开，这是模型层最重要的设计。

人话解释：

Provider 是你打给谁：OpenAI、Anthropic、OpenRouter、Bedrock、自定义 endpoint。
Transport 是你怎么说话：OpenAI Chat Completions、Anthropic Messages、Codex Responses、Bedrock Converse。

技术名词：ProviderProfile + ProviderTransport separation。

#6.1 ProviderProfile

providers/base.py:ProviderProfile 是 provider 的声明式描述，包含：

name
api_mode
env_vars
base_url
default_headers
fixed_temperature
default_max_tokens
default_aux_model
prepare_messages()
build_extra_body()
build_api_kwargs_extras()
fetch_models()

providers/__init__.py 负责 provider registry 和懒加载。它支持：

repo 内 plugins/model-providers/<name>/
用户 $HERMES_HOME/plugins/model-providers/<name>/
legacy providers/<name>.py

后注册可以覆盖先注册，因此用户 profile 可以覆盖 bundled provider。

#6.2 ProviderTransport

agent/transports/base.py:ProviderTransport 定义协议层合同：

convert_messages()
convert_tools()
build_kwargs()
normalize_response()
validate_response()
extract_cache_stats()
map_finish_reason()

agent/transports/chat_completions.py:ChatCompletionsTransport 是最常用路径。它会把 Hermes 内部 messages/tools 转成 OpenAI-style kwargs；如果传入 provider profile，就走 profile hooks，把 provider quirks 尽量移出主循环。

#6.3 Credential Pool

agent/credential_pool.py 处理多凭证轮转。关键类是：

PooledCredential
CredentialPool
mark_exhausted_and_rotate()

为什么 credential pool 不是简单 API key 数组？因为 Hermes 要区分 provider/custom endpoint、处理 exhausted TTL、OAuth token sync、pool strategy，以及不同 provider 的错误类型。凭证轮转还会影响 client、transport、compressor 和 fallback 状态，所以它既接近模型层，又被 run_agent.py 使用。

#6.4 剩余耦合

新增 OpenAI-compatible provider 通常只需要加 ProviderProfile。但新增全新 api_mode 仍可能要碰：

agent/transports/<mode>.py
agent/transports/__init__.py
hermes_cli/runtime_provider.py
run_agent.py 中 api_mode 判断、streaming、normalize、fallback、usage 分支

这部分插件化边界还需要进一步核验，尤其是全新协议的 streaming 和 tool-call normalize 合同。

#7. 工具系统：模型能看到哪些按钮，按钮按下后怎么执行

Hermes 的工具系统是三层：

tools/*.py / plugin / MCP
  -> tools/registry.py:ToolRegistry
  -> toolsets.py:能力包
  -> model_tools.py:get_tool_definitions()
  -> run_agent.py 发给模型
  -> model_tools.py:handle_function_call()
  -> registry.dispatch()

#7.1 `tools/registry.py`: 注册表

tools/registry.py:ToolRegistry 存每个工具的 ToolEntry，包括 schema、handler、toolset、check_fn、requires_env、async 标记、dynamic schema overrides。

关键函数：

register()：工具注册。内置工具通常在模块 import 时注册。
get_definitions()：按工具名返回 OpenAI-format schema，并执行 check_fn。
dispatch()：执行 handler，捕获异常并返回 JSON error。
tool_result() / tool_error()：统一工具返回格式。

源码里有几个很务实的设计：

check_fn 结果有短 TTL，避免每轮反复探测 Docker、Playwright、环境变量。
registry 有 _generation 计数器，插件/MCP 刷新工具后能让 schema cache 失效。
非 MCP 工具不允许互相 shadow，避免插件覆盖内置工具。
async handler 通过 _run_async() 桥接。

#7.2 `toolsets.py`: 能力包

toolsets.py 定义能力包，例如：

web
vision
terminal
file
browser
skills
memory
cronjob
messaging
delegate
hermes-cli
messaging platform 相关 toolset

人话解释：toolset 是“模型这次能看到的一组按钮”。同一个 Hermes 安装可以有很多工具，但某个入口或平台不一定应该暴露所有工具。

技术名词：capability gating。

#7.3 `model_tools.py`: schema 和 dispatch 薄层

model_tools.py:get_tool_definitions() 会根据 enabled/disabled toolsets 解析最终工具名集合，并用 registry 返回 schema。它还有 quiet-mode cache，cache key 包括：

enabled toolsets
disabled toolsets
registry._generation
config 文件 mtime/size

这说明 Hermes 对 Gateway 热路径做过优化：每条平台消息都重新算完整 schema 会很贵。

model_tools.py:handle_function_call() 做普通工具统一入口：

按 schema coercion 参数类型。
阻止 agent-loop tools 走普通 dispatch。
触发 pre_tool_call plugin hook，可被插件 block。
调用 registry dispatch。
触发 transform_tool_result hook。

#7.4 核心工具模块

模块	负责什么	为什么需要	先看哪里
`tools/file_tools.py`	读写、patch、搜索文件	Agent 编码任务基础能力	schema 注册、path 安全、patch 逻辑
`tools/terminal_tool.py`	本地/Docker/Modal/Daytona/Vercel/SSH 命令执行	运行测试、安装依赖、后台进程	`terminal()`、sandbox 创建、approval 调用
`tools/process_registry.py`	长进程状态、输出缓冲、恢复	命令可能跨 turn 完成	`ProcessSession`、checkpoint recovery
`tools/browser_tool.py`	浏览器自动化	Web 交互不是纯抓取	adapter/check_fn、URL 安全
`tools/code_execution_tool.py`	在沙箱内程序化调用工具	让模型批量调用白名单工具	strict/project mode、RPC transport
`tools/delegate_tool.py`	同步子 Agent	并行阅读/实现/验证	`_build_child_agent()`、`delegate_task()`
`tools/cronjob_tools.py`	让模型创建/管理 cron job	后台任务由 Agent 自己安排	`cronjob()` schema 和 create/update 分支
`tools/mcp_tool.py`	外部 MCP server 工具接入	把 MCP 生态折叠成 Hermes tool	`MCPServerTask`、`_register_server_tools()`

#8. Prompt caching：Hermes 架构里最值得学习的一条暗线

Prompt caching 的目标是让长会话里稳定的前缀被 provider 复用，降低成本和延迟。难点是缓存要求前缀尽量 bit-perfect。

Hermes 用四层保证这件事：

稳定系统提示
  run_agent.py:_build_system_prompt()
  hermes_state.py:sessions.system_prompt

临时上下文不污染前缀
  run_agent.py:ephemeral_system_prompt
  memory manager prefetch
  plugin pre_llm_call

显式 cache marker
  run_agent.py:_anthropic_prompt_cache_policy()
  agent/prompt_caching.py:apply_anthropic_cache_control()

可观测 cache stats
  agent/transports/*.extract_cache_stats()
  run_agent.py canonical usage
  hermes_state.py token/cache columns

agent/prompt_caching.py:apply_anthropic_cache_control() 会复制 API messages，然后在 system prompt 和最后几条非 system message 上加 cache breakpoints。run_agent.py:_anthropic_prompt_cache_policy() 决定 native Anthropic、OpenRouter、第三方 Anthropic-compatible、MiniMax、Alibaba/Qwen 等路径是否启用以及使用哪种布局。

这条设计带来的一个用户可见取舍是：旧 session 不一定自动看到刚修改的 memory、skill 或 context 文件，因为 system prompt 是 session 快照。Hermes 用这个代价换稳定 prefix cache。Gateway 的 _agent_cache 也是同一目标：复用 live AIAgent，不要每条消息都重建系统提示和 provider session state。

#9. Skills、Memory 与上下文管理

Hermes 里有好几种“记忆”，它们不是一回事。

类型	人话解释	技术位置	生命周期
System prompt	这场会话的规则和身份	`run_agent.py:_build_system_prompt()`	session 快照
Context files	项目规则，如 `AGENTS.md`	`agent/prompt_builder.py:build_context_files_prompt()`	构造 system prompt 时读取
Skills	可按需展开的操作手册	`tools/skills_tool.py`、`agent/skill_commands.py`	文件持久化，prompt 里放索引
Built-in memory	稳定事实/用户资料	`tools/memory_tool.py`	文件持久化，进入 system prompt 快照
External memory provider	外部召回/长期记忆	`agent/memory_manager.py`	system block + prefetch ephemeral
Session search	过去聊天记录检索	`tools/session_search_tool.py`、`hermes_state.py`	SQLite FTS
Context compression	当前会话太长时压缩	`agent/context_compressor.py`	有损摘要，可能创建 child session
Subdirectory hints	局部目录规则	`agent/subdirectory_hints.py`	懒加载

#9.1 Skills 是能力包，不是普通文档

tools/skills_tool.py 提供 skills_list、skill_view、skill_manage。系统提示不会把所有 skill 全文塞进去，而是通过 agent/prompt_builder.py:build_skills_system_prompt() 放入索引和使用规则，需要时再由模型调用 skill_view 展开。

人话解释：Skills 像工具书目录，不像把整本书塞进 prompt。

技术名词：progressive disclosure。

agent/skill_commands.py 还把技能变成 slash command，使 CLI/Gateway 可以直接触发某个 skill 工作流。

#9.2 Memory 分内置和外部

tools/memory_tool.py 管内置 memory 文件。run_agent.py:_build_system_prompt() 会读取 memory 和 USER.md，冻结进 session system prompt。

agent/memory_manager.py 管外部 memory provider。它有两种进入模型的方式：

build_system_prompt()：稳定的 provider 描述或重要记忆块。
prefetch_all()：每轮根据当前用户消息召回，注入到当前 user message 的 API copy，不写入 transcript。

这种边界很重要：长期事实可以进入 system snapshot，按问题召回的上下文不应该污染历史。

#9.3 Context compression 是有损状态管理

agent/context_compressor.py:ContextCompressor 不是粗暴删消息。它会：

用 SUMMARY_PREFIX 标注摘要只是 reference，不是 active instructions。
保护开头 protect_first_n 和结尾 protect_last_n。
先 _prune_old_tool_results() 裁剪旧工具输出。
_summarize_tool_result() 针对 terminal/read_file/search/browser/delegate 等工具做保留重点的摘要。
按 token budget 保护 tail。
update_model() 在切模型后重算 context length 和阈值。
用 anti-thrashing 避免压缩收益很小时反复压缩。

run_agent.py 会在 turn 前做 preflight compression。gateway/run.py 还有 session hygiene，在创建 Agent 前处理过大的 transcript。压缩可能创建新的 session 并通过 parent_session_id 关联，这就是为什么会话持久化和 Gateway session mapping 必须理解 compression lineage。

#10. SessionDB 与持久化：SQLite 不只是聊天记录

hermes_state.py:SessionDB 是 Hermes 的长期状态中心。它负责：

sessions 表：session metadata、model/provider、system prompt、token/cost/cache counters、parent session。
messages 表：role/content/tool_call/reasoning/codex fields。
messages_fts：全文搜索。
messages_fts_trigram：substring/trigram 搜索体验。
compression lineage：parent_session_id。
token accounting：prompt/completion/cache read/write 等计数。
Telegram topic bindings 等 Gateway 相关持久化。

源码里有很多生产级细节：

WAL 模式支持多读单写。
WAL 在 NFS/SMB/FUSE 等文件系统失败时降级到 DELETE journal。
database is locked 有 retry。
_reconcile_columns() 根据 CREATE TABLE 定义和 live columns 做轻量 migration。
finalize_orphaned_compression_sessions() 清理压缩中断后的孤儿 session。

人话解释：hermes_state.py 管的是“对话事实本身”。它不是 Gateway 的平台坐标索引。

gateway/session.py:SessionStore 管的是另一层：某个平台的某个聊天位置对应哪个 session_id。它用 session_key 做坐标：

SessionSource(platform, chat_id, thread_id, user_id, chat_type)
  -> build_session_key()
  -> SessionEntry(session_key, session_id, origin, reset policy, resume_pending...)
  -> hermes_state.py:sessions.id

这就是 Hermes 有两套状态的原因：

SQLite：保存 Agent 对话和可搜索历史。
Gateway sessions index：保存平台聊天位置到对话 ID 的映射，以及 reset/resume/pending 状态。

#11. Gateway：把多平台消息翻译成同一个 Agent loop

gateway/run.py:GatewayRunner 是 Gateway 的大脑。它很重，但它处理的现实问题也最多。

#11.1 平台统一模型

gateway/platforms/base.py 定义统一平台层：

MessageType
MessageEvent
SendResult
BasePlatformAdapter

MessageEvent 包含 text、message_type、source、media_urls、reply context、auto_skill、channel_prompt、internal、timestamp 等字段。平台 adapter 只要把 Telegram/Discord/Feishu/Slack/Webhook 等不同事件翻译成 MessageEvent，后面就可以走统一 Gateway 流程。

BasePlatformAdapter 管 message handler、active sessions、pending messages、background tasks、post-delivery callbacks、typing paused、TTS 开关等通用运行时状态。

#11.2 GatewayRunner 的控制流

gateway/run.py:GatewayRunner._handle_message() 负责入口分流：

pre_gateway_dispatch plugin hook
  -> user authorization / pairing
  -> pending update prompt
  -> slash confirm
  -> running-agent approve/deny/yolo fast path
  -> busy session interrupt/queue/steer
  -> slash command
  -> plugin command
  -> skill command
  -> normal agent path

真正进入 Agent 的逻辑在 _handle_message_with_agent()：

SessionStore.get_or_create_session()
Telegram topic lane binding
auto-reset notice
build_session_context_prompt()
PII redaction 判断
auto skill / channel prompt
history 加载
session hygiene
创建或复用 AIAgent
注册 approval callback
agent.run_conversation()
发送结果
清理 resume_pending

#11.3 平台上下文如何进入提示词

gateway/session.py:build_session_context_prompt() 会生成 ## Current Session Context，包括来源平台、用户、聊天类型、thread、connected platforms、delivery options，以及平台特殊说明。

插件平台还能在 gateway/platform_registry.py:PlatformEntry.platform_hint 里注册平台提示。第 14 轮遗留点已核验：run_agent.py 在构造 system prompt 时会查 platform_registry，如果当前 platform 有 platform_hint，就追加到 prompt。内置/插件 adapter 仍需要正确设置 AIAgent.platform 才能命中这条路径。

#11.4 Gateway 的复杂度来源

Gateway 必须解决这些问题：

用户授权和 pairing。
多平台消息格式差异。
同一 session 正在运行时，新消息是 interrupt、queue 还是 steer。
/approve 同时可能是危险命令审批，也可能是 destructive slash confirm。
Gateway 重启时正在跑的 Agent 要 drain，超时则 resume_pending。
平台 adapter 失败后后台重连。
_agent_cache 要保护 prompt cache，但不能无限增长。
/new、/reset 要清理 session-scoped model/reasoning/approval/yolo/queue 状态。

因此 gateway/run.py 是第二个 orchestration hotspot。后续可拆成 command dispatcher、active session controller、agent cache manager、adapter supervisor、session hygiene service。

#12. 平台 Adapter 案例

平台	主要源码	复杂点	与核心交互
Telegram	`gateway/platforms/telegram.py`	topic、reply anchor、相册、文件下载、offset/restart	生成 `MessageEvent`，send 时带 thread metadata
Discord	`gateway/platforms/discord.py`	slash command、thread、forum、voice、button approval	普通消息和 interaction 都归一到 Gateway
Feishu	`gateway/platforms/feishu.py`	富文本、卡片、websocket/webhook、reply_in_thread	normalize rich payload，出站组装 text/post/card/file/image
Webhook	`gateway/platforms/webhook.py`	外部 HTTP 唤醒、auth-before-body、rate limit、dedupe	可走 Agent mode，也可 `deliver_only` 跳过 Agent
API Server	Gateway 相关 adapter/module	OpenAI-compatible HTTP 门面	把 HTTP messages 变成 Hermes session/history/user turn
Plugin Platform	`gateway/platform_registry.py` + plugins	setup/auth/cron/send_message/PII/hint metadata	通过 `PluginContext.register_platform()` 注册

平台 adapter 的边界应该是：做平台 I/O、媒体缓存、消息格式归一化、出站分片和平台特有 metadata。不要在 adapter 里实现 Agent 行为。

#13. Cron、Webhook 与后台任务

Hermes 的异步能力有三条主线：定时任务、外部事件唤醒、长进程完成通知。

#13.1 Cron

Cron 不是系统 crontab，而是 Gateway 内部调度器。

关键源码：

cron/jobs.py：文件型 job 数据库，支持 create/list/update/remove。
tools/cronjob_tools.py:cronjob()：模型可调用的 cron 管理工具。
cron/scheduler.py:tick()：扫描 due jobs，文件锁防并发 tick。
cron/scheduler.py:run_job()：执行 job。
gateway/run.py:_start_cron_ticker()：Gateway 进程内定时 tick。

Cron job 有几种模式：

普通 Agent job：prompt/skills/script output -> AIAgent。
script + wake gate：脚本先跑，输出 JSON {"wakeAgent": false} 可跳过 Agent。
no_agent=True：脚本就是任务，stdout 直接投递。
context_from：读取其他 job 的最近输出作为上下文。

Cron Agent 会复用 run_agent.py:AIAgent，但运行环境有意和 live session 分开，避免后台任务污染当前聊天。

#13.2 Webhook

gateway/platforms/webhook.py:WebhookAdapter 启动 aiohttp HTTP 服务。它支持 static routes 和 dynamic subscriptions，处理 auth-before-body、限流、去重、模板渲染。

Webhook 有两种模式：

Agent mode：外部事件变成 MessageEvent，进入 Gateway/AIAgent。
deliver_only：渲染后的 prompt 直接投递，不跑模型。

#13.3 后台进程

tools/terminal_tool.py 支持后台运行，tools/process_registry.py 记录 ProcessSession、输出缓冲、完成状态和 checkpoint recovery。进程完成后，Gateway 可以通过 synthetic/internal event 通知 Agent 或用户。

这条链路解释了为什么 gateway/platforms/base.py:MessageEvent 有 internal 字段：系统生成的事件必须绕过用户授权，但仍走统一投递/会话路径。

#14. Plugins 与 MCP：扩展最后都要合流

#14.1 Plugin 是进程内扩展系统

hermes_cli/plugins.py 定义 PluginContext 和 plugin manager。插件可以：

register_tool()：注册工具到 tools.registry。
register_platform()：注册 Gateway 平台到 gateway.platform_registry。
注册 slash command、hooks、setup/auth helpers 等。

关键 hook 包括：

pre_gateway_dispatch
pre_llm_call
pre_tool_call
transform_tool_result
approval 相关 hook

这些 hook 分布在 gateway/run.py、run_agent.py、model_tools.py、tools/approval.py 等位置。能力很强，但 contract 也比较分散。插件作者要特别记住：pre_llm_call 返回的上下文会被注入当前 user message 的 API copy，不应该修改 stable system prompt。

#14.2 MCP 是外部工具生态的折叠层

tools/mcp_tool.py 把 MCP server 变成 Hermes tool。它支持：

stdio
HTTP/StreamableHTTP
SSE
OAuth/PCKE 相关路径
dynamic notifications/tools/list_changed
resources/prompts utility tools
sampling/createMessage
tool schema 规范化
suspicious description scanning
子进程 stderr 重定向到 mcp-stderr.log

核心对象是 MCPServerTask。连接后，_register_server_tools() 会把 MCP tool 转成 Hermes registry schema，工具名加 mcp_<server>_<tool> 前缀，toolset 形如 mcp-<server>，并注册 alias。动态刷新时 registry _generation 变化，model_tools.py 的 schema cache 随之失效。

Gateway 的 /reload-mcp 在 gateway/run.py:_handle_reload_mcp_command() 和 _execute_mcp_reload()，会关闭并重新发现 MCP server，同时清理 cached Agent，因为工具面变化会影响 prompt cache。

#15. TUI、ACP、Delegation 与 Kanban

#15.1 TUI

tui_gateway/server.py 是本地 JSON-RPC gateway。它保留 stdout 给 JSON-RPC，把 Python stdout 重定向到 stderr，避免污染 TUI 协议流。

关键路径：

method registry 处理 JSON-RPC。
_build_agent / 相关构造逻辑创建 AIAgent。
prompt.submit 对应 @method("prompt.submit")。
_run_prompt_submit() 在线程里调用 agent.run_conversation()。
如果 compression 让 agent.session_id 变化，TUI 会 re-anchor session key。
TUI 还通过 tools/delegate_tool 暴露子 Agent 可观测性。

人话解释：TUI 不是新 Agent，它是本地 UI 到 AIAgent 的协议桥。

#15.2 ACP

acp_adapter/server.py 把 Hermes 包装成编辑器/Agent Client Protocol 风格的服务。核心类在源码中是 ACP Agent implementation，内部用 SessionManager 和 SessionState 管会话。

关键方法：

initialize()
new_session()
load_session()
resume_session()
fork_session()
list_sessions()
prompt()
set_session_model()
set_session_mode()
slash command handlers，如 _cmd_model()、_cmd_tools()、_cmd_compact()、_cmd_queue()。

prompt() 最终也在线程池中调用 agent.run_conversation()。ACP 还支持 session-scoped MCP registration，注册后会刷新 tool surface。

#15.3 Delegation

tools/delegate_tool.py 把“子 Agent”做成一个工具。delegate_task() 会构造子 AIAgent，给它独立上下文、受限 toolsets、角色和深度限制。它支持：

leaf / orchestrator role。
max spawn depth。
max concurrent children。
子 Agent heartbeat/progress/tool events。
将子 Agent cost/summary 汇总回父 Agent。

它是当前 turn 内同步多 Agent，不等同于 Kanban。

#15.4 Kanban

hermes_cli/kanban.py 和相关 kanban DB 模块实现持久任务板。Kanban 是跨进程、持久、多 Agent 的任务系统。CLI/Gateway 通过 /kanban 共用同一套 argparse 执行路径，Gateway 内可运行 dispatcher。

一句话区分：

delegate_task：当前 turn 内派生子 Agent，结果回到父 Agent。
Kanban：持久任务队列，任务可跨进程、跨时间推进。

#16. 安全边界：不是一个模块，而是一组防线

Hermes 的安全逻辑分布在入口、工具、执行环境、日志和投递层。

入口权限
  gateway/run.py:_is_user_authorized()
  pairing store
  platform allowed_users_env

工具暴露
  toolsets.py
  tools/registry.py check_fn

危险命令
  tools/approval.py
  tools/terminal_tool.py
  gateway/run.py /approve /deny /yolo

执行隔离
  tools/terminal_tool.py
  Docker / Modal / Daytona / Vercel Sandbox / SSH / local

URL 与浏览器
  tools/browser_tool.py
  private URL / SSRF 保护

日志与输出脱敏
  hermes_logging.py
  agent.redact
  terminal output redaction

平台隐私
  gateway/session.py:build_session_context_prompt(redact_pii=True)

#16.1 Approval/yolo

tools/approval.py 是危险命令审批的 single source of truth。它包含：

hardline blocklist：即使 yolo 也不能绕过。
dangerous pattern detection。
session-scoped approval。
permanent approval。
session yolo。
Gateway blocking approval queue。
contextvars 绑定当前 session key。

Gateway 的 /approve 和 /deny 在 gateway/run.py 中处理，会调用 resolve_gateway_approval() 解锁阻塞中的工具线程。重要细节：Gateway 审批不是“告诉模型用户同意了”，而是工具线程真的在 tools/approval.py 里阻塞等待事件。

#16.2 Sandbox

tools/terminal_tool.py 支持多执行后端。本地执行会经过危险命令检查，容器/云 sandbox 有不同边界。源码中对 task_id 到 sandbox key、创建锁、清理线程、后台进程保活都有处理。

人话解释：Hermes 不是看到 terminal tool 就裸 subprocess.run()。它会先决定在哪个环境跑、是否危险、是否要审批、如何保存后台状态、输出是否脱敏。

#16.3 Redaction 和 PII

hermes_logging.py 和 agent.redact 负责日志/工具输出密钥脱敏。Gateway 启动时会读取安全配置并设置 HERMES_REDACT_SECRETS。

gateway/session.py:build_session_context_prompt() 支持 PII redaction。它只对适合脱敏的平台启用，例如 Telegram/Signal/WhatsApp/BlueBubbles 这类不需要真实 mention ID 的平台；Discord 这类 mention 需要真实 ID 的平台不能简单哈希。

#17. 模块依赖图

hermes_cli/main.py
  -> hermes_cli/config.py
  -> hermes_cli/runtime_provider.py
  -> cli.py:HermesCLI
      -> run_agent.py:AIAgent

gateway/run.py:GatewayRunner
  -> gateway/platforms/base.py
  -> gateway/platforms/<platform>.py
  -> gateway/session.py:SessionStore
  -> gateway/delivery.py
  -> hermes_cli/runtime_provider.py
  -> run_agent.py:AIAgent

run_agent.py:AIAgent
  -> model_tools.py
      -> toolsets.py
      -> tools/registry.py
      -> tools/*.py
  -> providers/base.py / providers/__init__.py
  -> agent/transports/*
  -> agent/credential_pool.py
  -> agent/prompt_builder.py
  -> agent/context_compressor.py
  -> agent/memory_manager.py
  -> hermes_state.py:SessionDB
  -> hermes_cli/plugins.py hooks

tools/mcp_tool.py
  -> tools/registry.py
  -> model_tools.py cache invalidation via registry generation

cron/scheduler.py
  -> run_agent.py:AIAgent
  -> gateway/delivery.py / adapters

#18. 常见坑

坑	为什么会发生	读哪里
改了 memory/skill，旧 session 没变化	system prompt 是 session 快照，为 prompt cache 稳定服务	`run_agent.py:_build_system_prompt()`、`sessions.system_prompt`
Gateway 一条消息没进模型	可能被 auth、pairing、slash、busy、plugin hook、approve fast path 截走	`gateway/run.py:_handle_message()`
`/approve` 行为看起来冲突	危险命令审批优先于 slash confirm	`gateway/run.py` pending confirm 逻辑、`tools/approval.py`
新工具注册了但模型看不到	toolset 没启用、check_fn 失败、schema cache 还没失效	`toolsets.py`、`model_tools.py`、`tools/registry.py`
新 MCP 工具不刷新	MCP server 未连接、`tools/list_changed` 未触发、需要 `/reload-mcp`	`tools/mcp_tool.py`、`gateway/run.py:_execute_mcp_reload()`
Gateway 重启后会话“接着说”	`resume_pending` 或新消息前有 fresh tool tail	`gateway/session.py`、`gateway/run.py` resume logic
压缩后 session_id 变了	compression lineage 用 `parent_session_id` 串起来	`agent/context_compressor.py`、`hermes_state.py`
新 provider 可以 profile 化，新 api_mode 却要改 core	provider 和协议抽象成熟度不同	`providers/base.py`、`agent/transports/*`、`run_agent.py`
tool result 太大导致上下文爆炸	工具有结果大小限制，compressor 也会裁剪旧 tool output	`tools/registry.py`、`agent/context_compressor.py`

#19. 如果要扩展 Hermes，从哪里下手

#加一个普通 tool

先读：

tools/registry.py
model_tools.py
toolsets.py
一个相近 tools/*.py

步骤：

写 schema 和 handler。
用 registry.register() 注册。
放进合适 toolset，或注册 plugin toolset。
确认 check_fn 不会在热路径里太慢。
用 CLI/Gateway 都跑一次，因为两者 enabled toolsets 可能不同。

#加一个 provider

如果是 OpenAI-compatible：

写 ProviderProfile。
在 provider plugin 或 providers 中注册。
配置 base_url/env_vars/default_headers。
尽量用 ChatCompletionsTransport 的 profile path。

如果是全新协议，需要进一步核验 streaming、tool calls、usage、fallback、credential rotation 的完整合同，大概率要新增 transport 并改 run_agent.py 若干分支。

#加一个平台

优先走 plugin platform：

hermes_cli/plugins.py:PluginContext.register_platform()
gateway/platform_registry.py:PlatformEntry
gateway/platforms/base.py:BasePlatformAdapter

平台 adapter 需要实现 connect/close/send，并把入站事件转成 MessageEvent。如果平台要支持 cron delivery、standalone send、PII 策略、setup/auth、平台提示，也要填 PlatformEntry 的 metadata。

#加一个 skill

读：

tools/skills_tool.py
agent/skill_commands.py
agent/prompt_builder.py

Skill 是 Markdown 能力包。写 skill 时应该把正文当作“模型可执行的操作规程”，而不是给人看的散文。需要工具时，在 skill 里明确触发条件和文件路径。

#加 slash command

先看中央命令定义，再看双执行路径：

hermes_cli/commands.py
cli.py 中 CLI handler
gateway/run.py 中 Gateway handler

注意 Gateway 的 fast path 优先级：/approve、/deny、/yolo、/stop、/new 等命令在 agent running 时有特殊路径。

#20. 架构优缺点

维度	优点	代价
共享主循环	多入口行为一致，工具/provider/memory/compression 复用	`run_agent.py` 过大
Provider 抽象	新 provider 多数可 profile 化	新 api_mode 仍耦合主循环
工具系统	registry/toolset/schema cache 分层清楚	cache invalidation 和动态 schema 复杂
Prompt caching	system snapshot + ephemeral injection 设计成熟	用户修改上下文不一定影响旧 session
持久化	SQLite + FTS + WAL retry + compression lineage 扎实	SQLite、SessionStore JSON、内存 cache 多状态并存
Gateway	多平台统一到 `MessageEvent`，恢复/中断/队列完整	`GatewayRunner` 是第二个巨型控制器
Cron/Webhook	后台任务复用 Agent，又支持跳过 Agent 的轻量模式	origin/delivery/session 语义需要仔细区分
Plugin/MCP	扩展面丰富，最终合流到 registry	hook contract 分散，MCP 启动/刷新复杂
安全	approval、sandbox、redaction、PII 多层防线	安全审计不能只看一个文件

#21. 最终理解：Hermes 的核心取舍

Hermes 的设计取舍可以概括成三句话：

多入口收敛到一个 Agent loop。 这让 CLI、Gateway、Cron、TUI、ACP、Delegation 共享能力，但也让 run_agent.py 承担巨大编排压力。
扩展能力通过 registry 合流。 Tool、Provider、Platform、MCP、Plugin 都尽量注册到统一表里，再由核心运行时消费。这让扩展变得自然，但 schema/cache/hook 失效逻辑变复杂。
长会话是第一等公民。 SessionDB、prompt snapshot、prompt caching、context compression、resume_pending、agent cache、FTS/search 都是为了让 Agent 在真实长期使用中不断线、不爆上下文、可恢复、可追溯。

如果只想快速改一个功能，先找 registry 和边界模块，不要一上来改 run_agent.py 或 gateway/run.py。如果想彻底理解 Hermes，就从 AIAgent.run_conversation() 和 GatewayRunner._handle_message() 两条控制流开始，把 Provider、Tool、Session、Prompt、Security 逐个挂上去。最终你会看到：Hermes 不是把 LLM 接到命令行这么简单，而是在把 LLM 变成一个可长期运行、可扩展、可恢复、能跨平台工作的软件系统。

#22. 仍需要进一步核验的点

本文已覆盖核心源码路径，但有几处仍建议在后续专题里继续核验：

新增全新 api_mode 的完整插件化合同，尤其 streaming、usage、tool-call normalize、fallback 和 credential rotation。
API server adapter 的完整 HTTP schema 兼容细节。
Kanban DB/dispatcher 的数据库 schema 和 worker 生命周期可单独展开。
MCP OAuth、sampling 和 session-scoped MCP 在不同客户端版本下的兼容矩阵。
run_agent.py 与 gateway/run.py 的建议拆分只是架构评价，未实际验证重构可行性。