Brian Armstrong on how to halve AI spend while token usage keeps growing

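Why it was selected

Coinbase co-founder Brian Armstrong shares a practical playbook: cheaper default models, combined with better caching and routing, can cut AI costs in half. The open-source models GLM 5.2 and Kimi 2.7 take center stage, and the cache hit rate jumped from 5% to 60%.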

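AI summary

In a tweet, Brian Armstrong shared how Coinbase keeps its AI costs under control. He said that after the default model was switched to open-source models such as GLM 5.2 and Kimi 2.7, 91% of employees never hit their usage limit. Caching improvements raised LibreChat's cache hit rate from 5% to 60%. Together, these measures cut AI spend by nearly half while token usage continued to grow. He also stressed the importance of routing optimization and trimming context.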
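
To make the caching figure concrete, here is a rough sketch of how a cache hit rate feeds into input-token cost. The 5% and 60% hit rates come from the summary above; everything else is an assumption for illustration, in particular the function name `effective_input_cost` and the idea that a cached input token is billed at 10% of the normal price. Actual provider pricing, and whatever Coinbase actually pays, may differ.

```python
def effective_input_cost(hit_rate: float, cached_price_ratio: float = 0.1) -> float:
    """Relative cost per input token; 1.0 means no caching at all.

    cached_price_ratio is a hypothetical discount: the fraction of the
    normal price charged for a token served from the prompt cache.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    # Uncached tokens pay full price; cached tokens pay the discounted price.
    return (1.0 - hit_rate) + hit_rate * cached_price_ratio


before = effective_input_cost(0.05)  # hit rate before the caching work
after = effective_input_cost(0.60)   # hit rate after the caching work
print(f"before: {before:.3f}, after: {after:.3f}, ratio: {after / before:.2f}")
```

Under that assumed 10% cached price, the relative input cost drops from 0.955 to 0.460, roughly 48% of its earlier level. This is only the arithmetic of the assumption, not a figure from the tweet, and it ignores output tokens and the model switch.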

AI translation · Chinese

Clement Delangue: the future of AI is multi-model (including a majority of open-source ones provided by @huggingface of course!)!
Brian Armstrong (@brian_armstrong): How to keep AI spend flat while token usage grows exponentially: Not with f…
