Peking University and DeepSeek Open-Source DSpark, Speeding Up LLM Inference by 60-85%

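Why it was selected

DSpark, from Peking University and DeepSeek, speeds up LLM inference by 60-85% without any changes to the model, and raises throughput by up to 661% under strict latency constraints. Worth trying if you work on model deployment.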

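AI Summary

Peking University and DeepSeek have jointly open-sourced DSpark, a speculative decoding framework that speeds up LLM inference by 60-85% without requiring any modification to the model. Under strict latency constraints, the throughput gain reaches up to 661%. DSpark achieves this reduction in inference latency through its speculative decoding strategy. The project is available as open source on GitHub.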
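For readers unfamiliar with the term, the generic idea behind speculative decoding can be sketched as follows. This is a minimal toy illustration of the greedy variant, not DSpark's actual algorithm or API: the names `speculative_decode`, `greedy_decode`, `toy_target`, and `toy_draft` are hypothetical, and the "models" are plain Python functions over integer tokens. A cheap draft model proposes several tokens, and the target model checks them, keeping the matching prefix and correcting the first mismatch.

```python
from typing import Callable, List

# A "model" here is any function mapping a token context to the next token.
NextToken = Callable[[List[int]], int]


def greedy_decode(target: NextToken, prompt: List[int], max_new: int) -> List[int]:
    """Baseline: one target-model call per generated token."""
    out = list(prompt)
    for _ in range(max_new):
        out.append(target(out))
    return out


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[int], max_new: int, k: int = 4) -> List[int]:
    """Greedy speculative decoding: draft proposes up to k tokens, target verifies."""
    out = list(prompt)
    limit = len(prompt) + max_new
    while len(out) < limit:
        # 1. The draft model proposes a short continuation.
        proposal: List[int] = []
        for _ in range(min(k, limit - len(out))):
            proposal.append(draft(out + proposal))
        # 2. The target model checks each proposed token. A real system does this
        #    in a single batched forward pass; here it is a loop for clarity.
        accepted: List[int] = []
        for tok in proposal:
            expected = target(out + accepted)
            if tok == expected:
                accepted.append(tok)
            else:
                accepted.append(expected)  # replace the first mismatch and stop
                break
        else:
            # Every proposal matched: the verification pass yields one extra token.
            if len(out) + len(accepted) < limit:
                accepted.append(target(out + accepted))
        out.extend(accepted)
    return out


def toy_target(ctx: List[int]) -> int:
    """Hypothetical stand-in for the large model."""
    return (sum(ctx) * 3 + 1) % 7


def toy_draft(ctx: List[int]) -> int:
    """Hypothetical stand-in for the small model: wrong on every third position."""
    guess = toy_target(ctx)
    return guess if len(ctx) % 3 else (guess + 1) % 7
```

Because every accepted token is checked against the target model, the output in this greedy sketch is identical to ordinary decoding with the target model alone; the saving comes from verifying several tokens per target pass, which is also why the technique needs no change to the model itself.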

Source · Pandaily

Peking University and DeepSeek have jointly open-sourced DSpark, a speculative decoding framework that boosts LLM inference speed by 60-85%, with up to 661% throughput gain under strict latency constraints.
