Cerebras Runs Trillion-Parameter Model Kimi K2.6 at 1,000 Tokens/s

Why It Was Selected

Cerebras gets a trillion-parameter model running at a thousand tokens per second.

AI Summary

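Cerebras is running Kimi K2.6, a trillion-parameter model, in enterprise testing. According to measurements by Artificial Analysis, its inference speed is about 1,000 tokens per second, the fastest performance recorded so far for a frontier model. This rebuts earlier doubts that large open-source models could be run quickly.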

Clement Delangue: I remember when people were saying "It's useless to open-source big models because nobody will be able to run them fast"....

Cerebras (@cerebras): Cerebras is now running Kimi K2.6 – a trillion parameter model…
