Diffusion Gemma: 4x Faster, but 6x More Factual Errors

Why It Was Selected

Faster inference comes at the cost of more hallucinations, and Google itself has issued an advance warning about it.

AI Summary

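On a single H100 (FP8), Diffusion Gemma reaches 763 tok/s, about 3.5 times the 218 tok/s of Gemma 4 (rounded to 4x in the original post). In the factual-accuracy test, however, Diffusion Gemma got 33 facts right and 28 wrong, while Gemma 4 got 45 right and 5 wrong. The more obscure the topic, the more errors appeared: 4 in the Steve Jobs biography, 12 on Tetris, and 12 in the BeOS story. Diffusion Gemma made up the name of Steve Jobs's mother and the names of colleagues connected to the game, and invented a BeBox price of $9,999 (the actual price was $1,600).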
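The headline ratios can be recomputed from the figures in the summary. The following is a minimal sketch, not part of the original benchmark: the variable names and the `accuracy` helper are illustrative assumptions, and only the numbers come from the summary.

```python
# Figures as reported in the summary; all names here are illustrative.
throughput = {"Diffusion Gemma": 763, "Gemma 4": 218}  # tok/s, single H100, FP8
facts = {
    "Diffusion Gemma": {"correct": 33, "wrong": 28},
    "Gemma 4": {"correct": 45, "wrong": 5},
}
# Wrong facts per topic, as reported for Diffusion Gemma.
errors_by_topic = {"Steve Jobs biography": 4, "Tetris": 12, "BeOS story": 12}


def accuracy(counts):
    """Share of checked facts that were correct."""
    return counts["correct"] / (counts["correct"] + counts["wrong"])


speedup = throughput["Diffusion Gemma"] / throughput["Gemma 4"]
error_ratio = facts["Diffusion Gemma"]["wrong"] / facts["Gemma 4"]["wrong"]

print(f"speedup: {speedup:.1f}x")
print(f"ratio of wrong facts: {error_ratio:.1f}x")
for model, counts in facts.items():
    print(f"{model}: accuracy {accuracy(counts):.1%}")
# The per-topic counts should add up to the reported total of wrong facts.
print("per-topic errors sum:", sum(errors_by_topic.values()))
```

The speedup works out to 3.5x and the ratio of wrong facts to 5.6x, which the post rounds to 4x and 6x. Because the two models were checked on different numbers of facts (61 versus 50), comparing error rates (roughly 46% versus 10%) gives a somewhat smaller gap than comparing raw error counts.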


@atomic_chat_hq: Diffusion Gemma is 4x faster, but makes 6x more mistakes! We benchmarked the new diffusion LLM against its autoregressive twin on a single H100 (FP8). We gave each the same three tasks: write a Steve Jobs biography, the …

arXiv cs.AI, 06-18 17:59
SuperTechFans, 06-16 23:26
Philipp Schmid, 06-17 14:44
vLLM, 06-16 12:16
