MCP (Model Context Protocol)

Overview

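MCP (Model Context Protocol) is a standardized protocol introduced by Anthropic in November 2024 to connect LLM agents with external tools such as file systems, Git, databases, Slack, and GitHub. It has since been adopted by major vendors including OpenAI, Google, and Microsoft ^[raw/papers/tool-attention-mcp-tax.md]. Its core value is a uniform tool interface that reduces what used to be N × M custom integrations to an N + M composable surface: each agent client can discover and invoke the tools exposed by any compliant server through a standardized JSON-RPC 2.0 handshake ^[raw/papers/tool-attention-mcp-tax.md].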
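To make the discover-then-call flow concrete, here is a minimal sketch of the two JSON-RPC 2.0 requests a client would send. The method names `tools/list` and `tools/call` come from the MCP specification; the helper `jsonrpc_request`, the tool name `read_file`, and its `path` argument are hypothetical, and the initial `initialize` exchange and the transport are omitted.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids should be unique within a session


def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request object as a plain dict (hypothetical helper)."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg


# Discovery: ask the server which tools it exposes.
list_req = jsonrpc_request("tools/list")

# Invocation: call one discovered tool by name with JSON arguments.
# "read_file" and "path" are illustrative, not taken from a real server.
call_req = jsonrpc_request(
    "tools/call",
    {"name": "read_file", "arguments": {"path": "README.md"}},
)

print(json.dumps(list_req))
print(json.dumps(call_req, indent=2))
```

Because every server answers the same two methods, a client written once against this shape can talk to any compliant server, which is where the N + M figure comes from.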

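However, MCP's stateless design carries significant systemic overhead. Because the underlying chat-completions API is stateless, host clients (Claude Desktop, Cursor, VS Code, Claude Code) must re-serialize the entire tool catalog on every conversation turn ^[raw/papers/tool-attention-mcp-tax.md]. Each request re-injects the full tool schemas (JSON Schema), a hidden cost that the community calls the MCP Tax or Tools Tax ^[raw/papers/tool-attention-mcp-tax.md].

In a typical deployment of 4–6 servers, this stateless re-injection costs 15,000–55,000 tokens per turn, and under aggressive tool expansion it can exceed 150k tokens ^[raw/papers/tool-attention-mcp-tax.md]. The Tools Tax causes three cascading failures: economically, it raises per-session spend by an order of magnitude; cognitively, LLM reasoning quality collapses once context utilization exceeds roughly 70%; and in security terms, it makes the Tool Poisoning Attack possible ^[raw/papers/tool-attention-mcp-tax.md].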

Core Contributions

MCP's core contribution is an open specification that standardizes the exchange format between LLMs and external tools ^[raw/papers/tool-attention-mcp-tax.md]. Its main features are:

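Contribution	Description
Uniform interface	Discover and call the tools of any compliant server through a JSON-RPC 2.0 handshake
N + M composability	Reduces integration complexity from O(N×M) to O(N+M)
Cross-vendor adoption	Adopted by Anthropic, OpenAI, Google, and Microsoft
Ecosystem growth	Supports many server types, including Filesystem, Git, Database, Slack, and GitHub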

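That said, Anthropic's November 2025 update acknowledges that the code-execution pattern can cut token usage by up to 98.7% in data-intensive workflows, but this optimizes tool outputs rather than the tool definitions themselves ^[raw/papers/tool-attention-mcp-tax.md].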

Protocol Architecture

3.1 Protocol Mechanics

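Let M = {t₁, …, t_N} be the set of N tools exposed by all MCP servers connected to an agent host during a session. Each tool tᵢ is described by a 4-tuple (nameᵢ, descᵢ, schemaᵢ, outputᵢ), where schema is a JSON Schema object listing typed parameters, descriptions, enums, and required/optional flags ^[raw/papers/tool-attention-mcp-tax.md].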

Let τᵢ denote the tokenized length of the serialized tool definition under the model's tokenizer (typically cl100k_base):

τᵢ = τᵢ^name + τᵢ^desc + τᵢ^schema + τᵢ^output ^[raw/papers/tool-attention-mcp-tax.md]

Under naive MCP injection, every turn of a K-turn conversation re-serializes all N definitions. The per-session Tools Tax is therefore:

T_tax(N, K) = K · Σᵢ₌₁ᴺ τᵢ ≈ K · αN + Σᵢ₌₁ᴺ |descᵢ|^chars ^[raw/papers/tool-attention-mcp-tax.md]

The approximation on the right follows a community rule of thumb: the name, description, and full schema together come to roughly 200–500 tokens per tool ^[raw/papers/tool-attention-mcp-tax.md].
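As a quick worked example of the exact form T_tax(N, K) = K · Σ τᵢ, the sketch below multiplies the summed per-tool token lengths by the number of turns. The function name `tools_tax` and the 50-tool, 350-token host are assumptions chosen to sit inside the 200–500 tokens-per-tool rule of thumb; they are not measurements.

```python
def tools_tax(tool_token_lengths, turns):
    """Per-session Tools Tax: K times the sum of tau_i over all N tool definitions."""
    return turns * sum(tool_token_lengths)


# Hypothetical host: 50 tools, each serializing to 350 tokens.
taus = [350] * 50

per_turn = tools_tax(taus, turns=1)    # cost of one re-injection
session = tools_tax(taus, turns=20)    # cost across a 20-turn session

print(f"per turn: {per_turn:,} tokens")
print(f"20-turn session: {session:,} tokens")
```

The per-turn cost is fixed by the catalog, so the session total grows linearly in K even when no tool is ever called.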

3.2 Context Window and Utilization

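Let C_max be the model's nominal context window, and let C_task(K) be the tokens that are genuinely useful to the task at turn K (user messages, assistant reasoning, tool outputs). The effective context utilization is: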

ρ(K) = C_task(K) / (C_task(K) + T_tax(N,K) + C_sys) ^[raw/papers/tool-attention-mcp-tax.md]

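Empirical studies report that reasoning quality degrades sharply once ρ falls to about 0.3, which the source describes as context utilization exceeding roughly 70% ^[raw/papers/tool-attention-mcp-tax.md]. The model begins to hallucinate parameters, confuse similar tools, and lose multi-step coherence.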
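The ratio ρ is easy to evaluate by hand, and a small sketch shows how quickly tool overhead pushes it toward the 0.3 region. The function name `effective_utilization` and all token counts below are hypothetical, and `c_sys` is assumed here to stand for system-prompt tokens, which the text above does not define explicitly.

```python
def effective_utilization(c_task, t_tax, c_sys):
    """rho = C_task / (C_task + T_tax + C_sys): share of context doing task work."""
    total = c_task + t_tax + c_sys
    if total == 0:
        raise ValueError("context is empty")
    return c_task / total


# Hypothetical turn: 10k task tokens, 2k system tokens (assumed meaning of C_sys).
heavy = effective_utilization(c_task=10_000, t_tax=20_000, c_sys=2_000)
light = effective_utilization(c_task=10_000, t_tax=2_000, c_sys=2_000)

print(f"20k tool tokens: rho = {heavy:.2f}")
print(f" 2k tool tokens: rho = {light:.2f}")
```

With the task and system tokens held fixed, cutting the tool overhead from 20k to 2k tokens moves ρ from roughly 0.31 to roughly 0.71 in this toy setting.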

3.3 Security Externality: Tool Poisoning Attack

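Because every description token is parsed by the LLM's reasoning loop, an adversary who controls a single tool description can inject instructions that hijack the agent, even if that tool is never invoked (Tool Poisoning Attack, TPA) ^[raw/papers/tool-attention-mcp-tax.md]. The larger the injected schema corpus, the larger the attack surface.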

MindGuard defines the Total Attention Energy between a generated token u (such as a tool-call action) and a context metadata token v:

TAE(u,v) = Σₗ₌₁ᴸ Σₕ₌₁ᴴ αₗ,ₕ(u→v)² ^[raw/papers/tool-attention-mcp-tax.md]

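MindGuard's central observation is that successful tool calls accumulate high TAE between the generated action tokens and the tokens of the selected tool's schema. Crucially, a schema that is not in the prompt cannot reach high TAE ^[raw/papers/tool-attention-mcp-tax.md].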
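The TAE formula is a sum of squared attention weights over all L layers and H heads, which the following sketch computes for a single (u, v) token pair. The nested-list layout `attn[l][h]` and the toy weights are assumptions for illustration; this is not MindGuard's code, and extracting real attention weights from a model is out of scope here.

```python
def total_attention_energy(attn):
    """TAE(u, v): sum over layers l and heads h of alpha_{l,h}(u -> v) squared.

    attn[l][h] holds the attention weight from generated token u to context
    token v in layer l, head h (layout assumed for this sketch).
    """
    return sum(weight * weight for layer in attn for weight in layer)


# Toy 2-layer, 2-head model. Values are made up.
schema_token = [[0.5, 0.5], [0.5, 0.5]]    # u attends strongly to a schema token
unrelated_token = [[0.1, 0.0], [0.0, 0.1]]  # u barely attends to this token

print(total_attention_energy(schema_token))
print(total_attention_energy(unrelated_token))
```

Squaring emphasizes concentrated attention: four heads at 0.5 yield far more energy than the same tokens at 0.1, which is the contrast the observation above relies on.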

MCP Ecosystem

MCP is supported by a number of mainstream tools and platforms:

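Platform	Type	Notes
Claude Desktop	Official Anthropic desktop app	Native support for MCP tool discovery and invocation
Cursor	AI code editor	Deep integration with the MCP protocol
VS Code (via extension)	Microsoft code editor	MCP support through extensions
Claude Code	Command-line coding agent	Agent tool calling in terminal environments
OpenHands	Software development agent	Community-developed open-source agent
Enterprise database deployment	106 tools	Typical enterprise MCP deployment configuration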

The GitHub MCP suite (full configuration) contains 93 tools and incurs a Tools Tax of about 55,000 tokens per request ^[raw/papers/tool-attention-mcp-tax.md].

Key Results

Token Overhead Analysis

Table 1 reports per-server token footprints of real MCP deployments, drawn from three independent public audits ^[raw/papers/tool-attention-mcp-tax.md]:

Server	Tools	Tokens/turn	Share of 200k window
Filesystem	8–12	~1,500	0.75%
Git	15–20	~3,000	1.50%
Database	10–15	~2,500	1.25%
Web Search	5–8	~1,200	0.60%
Slack	10–15	~2,000	1.00%
Custom internal	varies	5,000–8,000	2.5–4.0%
GitHub (full)	93	~55,000	27.5%
Enterprise DB	106	~54,600	27.3%
Typical 4-server host	40–60	15k–20k	7.5–10%

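These figures are lower bounds: they assume perfect description hygiene and count only tool definitions, excluding the system prompt, conversation history, and intermediate tool outputs ^[raw/papers/tool-attention-mcp-tax.md].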
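The last column of Table 1 is simply tokens per turn divided by a 200,000-token window. The short check below recomputes it for four rows, using the token counts from the table; the helper name `window_share` is an assumption.

```python
CONTEXT_WINDOW = 200_000  # window size used for the table's last column


def window_share(tokens_per_turn):
    """Percentage of the 200k window consumed by tool definitions each turn."""
    return 100 * tokens_per_turn / CONTEXT_WINDOW


# Tokens/turn values copied from Table 1.
rows = {
    "Filesystem": 1_500,
    "Git": 3_000,
    "GitHub (full)": 55_000,
    "Enterprise DB": 54_600,
}

for server, tokens in rows.items():
    print(f"{server}: {window_share(tokens):.2f}%")
```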

Context Utilization Data

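On a synthetic testbed of 120 tools across six servers (per-server token counts calibrated against public deployment audits), the simulation results are as follows ^[raw/papers/tool-attention-mcp-tax.md]: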

Method	Tokens/turn	ρ (Context Utilization)	Success %†	P50 (s)†	Cost/task†
Full-Schema (B1)	47,312	0.24	~72%	4.2s	$0.21
Static Pruning (B2)	11,865	0.56	~58%	3.8s	$0.09
Simple Retrieval (B3)	4,082	0.78	~81%	2.2s	$0.04
CLI Lazy (B4)	480	0.94	~88%	2.4s	$0.03
Tool Attention (ours)	2,368	0.91	~94%	2.0s	$0.03

† Values marked with a dagger are projections, not live LLM measurements ^[raw/papers/tool-attention-mcp-tax.md].

Relative to the naive Full-Schema baseline, Tool Attention achieves a measured 95.0% reduction in per-turn tool tokens and a 3.8× improvement in effective context utilization ^[raw/papers/tool-attention-mcp-tax.md].
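Both headline figures can be recovered from the Full-Schema (B1) and Tool Attention rows of the table. The arithmetic below uses only those four table values; the variable names are illustrative.

```python
# Values copied from the Full-Schema (B1) and Tool Attention rows.
full_tokens, ta_tokens = 47_312, 2_368
full_rho, ta_rho = 0.24, 0.91

reduction = 1 - ta_tokens / full_tokens  # fraction of per-turn tool tokens removed
uplift = ta_rho / full_rho               # ratio of effective context utilization

print(f"token reduction: {reduction * 100:.1f}%")
print(f"utilization uplift: {uplift:.1f}x")
```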

Projected Reasoning Quality

Method	Mean Score	SD	% Scoring ≥ 4
Full-Schema (B1)	3.21	1.04	43.2%
Static Pruning (B2)	3.35	0.98	48.0%
Simple Retrieval (B3)	3.89	0.81	68.7%
CLI Lazy (B4)	4.02	0.77	74.1%
Tool Attention (ours)	4.43	0.62	87.6%

At turn 30 of a long-horizon task, the projected reasoning quality of Full-Schema falls to about 2.78, while Tool Attention holds at about 4.31 ^[raw/papers/tool-attention-mcp-tax.md].

Solutions

Several lines of work propose ways to reduce the MCP Tax ^[raw/papers/tool-attention-mcp-tax.md]:

Tool Attention

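Tool Attention is a middleware-layer mechanism that generalizes the "Attention Is All You Need" paradigm from self-attention over tokens to gated attention over tools ^[raw/papers/tool-attention-mcp-tax.md]. Its core components are: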

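Intent–Schema Overlap (ISO) Score: uses sentence embeddings to compute the semantic similarity between the query and each tool
State-Aware Gating Function: gates tools on their preconditions and access scope, given the current state
Two-Phase Lazy Schema Loader: keeps a compact summary pool in context and promotes full JSON schemas only for the top-k gated tools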
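Of the three components, the state-aware gate is the least self-explanatory, so here is one way it could look. This is a minimal sketch, assuming preconditions are modeled as a set of facts and access scope as a single string; the names `ToolMeta`, `SessionState`, and `gate` are hypothetical and are not taken from the Tool Attention paper.

```python
from dataclasses import dataclass, field


@dataclass
class ToolMeta:
    name: str
    requires: frozenset = frozenset()  # facts that must hold before the tool is usable
    scope: str = "public"              # access scope the caller must hold


@dataclass
class SessionState:
    facts: set = field(default_factory=set)
    scopes: set = field(default_factory=lambda: {"public"})


def gate(tool, state):
    """Admit a tool only if its preconditions hold and its scope is granted."""
    return tool.requires <= state.facts and tool.scope in state.scopes


# Hypothetical tools and session.
git_commit = ToolMeta("git_commit", requires=frozenset({"repo_open"}))
drop_table = ToolMeta("drop_table", scope="admin")
state = SessionState()

before = gate(git_commit, state)   # repo not opened yet
state.facts.add("repo_open")
after = gate(git_commit, state)    # precondition now satisfied
admin_ok = gate(drop_table, state) # session lacks the "admin" scope

print(before, after, admin_ok)
```

A hard gate of this kind runs alongside similarity scoring: a tool that is semantically relevant but not yet usable, or not permitted, never has its schema promoted.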

Lazy Schema Loading

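Full JSON Schemas are loaded only when needed, rather than re-sent with every request ^[raw/papers/tool-attention-mcp-tax.md]. In Phase 1, the Summary Pool (always resident) holds compact summaries of all N tools (≤ 60 tokens each), so the model knows which tools exist. In Phase 2, Schema Promotion (on demand, per turn) injects the full JSON Schema only for the selected tools.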
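The two phases can be sketched as a single function that always emits every summary and attaches full schemas only for the tools selected this turn. The registry layout, the tool names, and the function `build_tool_context` are hypothetical; how the selection is produced is left to the caller.

```python
def build_tool_context(registry, selected):
    """Phase 1: include every compact summary. Phase 2: promote selected schemas."""
    summary_pool = [
        {"name": name, "summary": tool["summary"]}
        for name, tool in registry.items()
    ]
    promoted = [
        {"name": name, "schema": registry[name]["schema"]}
        for name in selected
        if name in registry  # ignore selections that are not registered
    ]
    return {"summary_pool": summary_pool, "promoted": promoted}


# Hypothetical registry with a short summary and a full JSON Schema per tool.
registry = {
    "read_file": {
        "summary": "Read a text file from the workspace.",
        "schema": {"type": "object", "properties": {"path": {"type": "string"}},
                   "required": ["path"]},
    },
    "git_log": {
        "summary": "Show recent commits.",
        "schema": {"type": "object", "properties": {"limit": {"type": "integer"}}},
    },
}

context = build_tool_context(registry, selected=["read_file"])
print([t["name"] for t in context["summary_pool"]])
print([t["name"] for t in context["promoted"]])
```

The model still sees that `git_log` exists through its summary, but only `read_file` pays the full schema cost on this turn.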

Intent–Schema Overlap (ISO)

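Tools are selected by relevance to the user's intent rather than by injecting every schema blindly ^[raw/papers/tool-attention-mcp-tax.md]. Cosine similarity is computed with the sentence-transformers/all-MiniLM-L6-v2 encoder, with a typical threshold θ ∈ [0.22, 0.32].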
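The selection step reduces to cosine similarity, a threshold θ, and a top-k cut, as the sketch below shows. To stay dependency-free it uses hand-written 3-dimensional vectors as stand-ins for real sentence embeddings; the function names, the tool names, θ = 0.28 (a value inside the quoted range), and k = 3 are all assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_tools(query_vec, tool_vecs, theta=0.28, k=3):
    """Keep tools whose similarity to the query is >= theta, best first, at most k."""
    scored = [(cosine(query_vec, vec), name) for name, vec in tool_vecs.items()]
    kept = sorted((pair for pair in scored if pair[0] >= theta), reverse=True)
    return [name for _, name in kept[:k]]


# Toy vectors standing in for embeddings of the query and tool summaries.
query = [1.0, 0.0, 0.0]
tools = {
    "read_file": [0.9, 0.1, 0.0],
    "git_log": [0.5, 0.8, 0.1],
    "slack_post": [0.0, 0.2, 1.0],
}

print(select_tools(query, tools))
```

In a real deployment the vectors would come from the encoder named above, and the selected names would be handed to the schema promotion step.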

Limitations

Both MCP as a protocol and Tool Attention as a mitigation have the following limitations ^[raw/papers/tool-attention-mcp-tax.md]:

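Limitation	Description
Protocol-level statelessness	Tool Attention is an application-layer mitigation; it cannot fix the protocol-layer gap, namely the lack of session-scoped capability negotiation
Dependence on tool summary quality	Cryptic or poorly named tool registries degrade retrieval precision, and curator work cannot be eliminated entirely
Simulated workloads	The evaluation runs on synthetic (albeit calibrated) workloads; a community-standard MCP benchmark in the style of SWE-bench is needed
Adversarial paraphrase	An attacker may craft a tool description whose semantic fingerprint closely matches benign user queries, so that it is reliably gated in and its payload executed
Cross-turn state awareness	The query embedding currently uses only the latest user message (optionally with a rolling summary); a stronger version would condition on a learned state representation
Learned gate	The current threshold-based gate is deliberately interpretable at some cost in accuracy; a distilled classifier could replace the threshold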

References

Anthropic. “Introducing the Model Context Protocol.” Anthropic Engineering Blog, Nov. 2024.
Sadani, A. & Kumar, D. “Tool Attention Is All You Need: Dynamic Tool Gating and Lazy Schema Loading for Eliminating the MCP/Tools Tax in Scalable Agentic Workflows.” arXiv:2604.21816, Apr. 2026.
Wang, Z. et al. “MindGuard: Tracking, Detecting, and Attributing MCP Tool Poisoning Attack via Decision Dependence Graph.” arXiv:2508.20412, 2025.
Pan, T. “Why Your AI Agent Wastes Most of Its Context Window on Tools.” TianPan.co Blog, Jan. 2026.
