面向LLM智能体工作流并行分支的直接潜在空间合成

精选理由

并行合成提速2.5-11倍

AI 摘要

Parallel-Synthesis框架使合成器直接消费并行工作线程的KV缓存，避免文本拼接冗余。它通过缓存映射器校准独立分支缓存，并微调合成适配器以支持非顺序缓存接口。在9个数据集（数学、科学问答、代码生成、GAIA、多智能体数据库诊断）上，7个超越或持平文本合成基线，首token延迟降低2.5-11倍。该工作为并行智能体分支的高效合成提供了新接口。

AI 翻译 · 中文

arXiv cs.AILarge language models increasingly serve as execution engines for agentic systems, yet they still consume context through a sequential text interface. This creates a mismatch with modern structured agent workflows, in wh…

阅读原文