Google Gemma 4 12B Tested: Near-26B Performance at Roughly 60% of the VRAM

Why It Was Selected

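Despite its smaller parameter count, the new Gemma 4 12B came close to the 26B version on a coding task in testing, and it needs only 9 GB of VRAM, so it can run on a 16 GB laptop.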

AI Summary

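In a test on an RTX 4090, Google's Gemma 4 12B used only 9 GB of VRAM and generated 8.9k tokens at 80 tok/s, with results close to the 26B version. The comparison model, Gemma 4 26B-A4B, used 15 GB of VRAM, generated 6.9k tokens at 138 tok/s, and won in every scenario. Even so, the 12B model came very close while using roughly 60% of the VRAM (9 GB vs. 15 GB), which makes it a strong choice for 16 GB laptops.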
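The trade-off in these figures is easier to see as ratios. The sketch below only re-derives numbers from the summary; the dictionary layout and the helper name `generation_seconds` are illustrative, and the token counts are rounded in the source ("8.9k", "6.9k"), so the derived times are approximate.

```python
# Figures as quoted in the summary (single RTX 4090, one run per model).
# Token counts are rounded in the source, so derived values are approximate.
runs = {
    "Gemma 4 12B": {"vram_gb": 9, "tokens": 8900, "tok_per_s": 80},
    "Gemma 4 26B-A4B": {"vram_gb": 15, "tokens": 6900, "tok_per_s": 138},
}


def generation_seconds(run):
    """Approximate wall-clock generation time: tokens / throughput."""
    return run["tokens"] / run["tok_per_s"]


small = runs["Gemma 4 12B"]
large = runs["Gemma 4 26B-A4B"]

# 12B memory footprint and throughput relative to 26B-A4B.
vram_ratio = small["vram_gb"] / large["vram_gb"]
speed_ratio = small["tok_per_s"] / large["tok_per_s"]

for name, run in runs.items():
    print(f"{name}: {run['vram_gb']} GB VRAM, ~{generation_seconds(run):.0f} s to generate")
print(f"12B uses {vram_ratio:.0%} of the VRAM at {speed_ratio:.0%} of the speed")
```

On these numbers the 12B model saves about 40% of the VRAM but runs at a little under 60% of the 26B-A4B throughput and, because it also emitted more tokens, took roughly twice as long to finish the task.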


@atomic_chat_hq: New Google Gemma 4 12B claims near-26B performance - we tested both! We ran both models locally on one RTX 4090 and gave each the same task: write a self-contained HTML5 canvas animation with real physics in one file wit…

