Google DeepMind Releases Gemma 4 QAT Checkpoints: Q4_0 and a New Mobile Format Cut Memory

Why it was selected

The quantized Gemma 4 is here, cutting memory by about 75%

AI summary

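Google DeepMind has released QAT (quantization-aware training) checkpoints for Gemma 4, in the Q4_0 format (4-bit quantization) and in a newly developed mobile format. Compared with the BF16 version, Q4_0 reduces the model's memory footprint by about 75%, and the mobile format is optimized further to fit phones and similar devices. The checkpoints target edge-computing scenarios and balance accuracy against inference speed.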
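The "about 75%" figure follows from bit widths alone: 4 bits per weight is a quarter of BF16's 16 bits. The sketch below works through that arithmetic. It rests on assumptions not stated in the summary: the parameter count is a hypothetical placeholder, the Q4_0 block layout (32 weights stored as 16 bytes of 4-bit values plus one 2-byte fp16 scale) is the one used by ggml and is assumed to apply here, and every tensor is treated as quantized. The mobile format is not modelled because the summary gives no layout for it.

```python
BF16_BITS = 16.0      # bits per weight in the BF16 baseline
IDEAL_4BIT = 4.0      # bits per weight if only the 4-bit values are counted

# Assumption: ggml-style Q4_0 block = 32 weights packed into 16 bytes
# of 4-bit values plus one 2-byte fp16 scale, i.e. 18 bytes per block.
Q4_0_BITS = (16 + 2) * 8 / 32   # effective bits per weight


def weight_bytes(num_params: int, bits_per_weight: float) -> float:
    """Approximate bytes needed to store the weights alone."""
    return num_params * bits_per_weight / 8


def reduction_vs(baseline_bits: float, quant_bits: float) -> float:
    """Fraction of weight memory saved relative to the baseline."""
    return 1 - quant_bits / baseline_bits


if __name__ == "__main__":
    num_params = 4_000_000_000  # hypothetical size, not taken from the source
    for name, bits in [("BF16", BF16_BITS),
                       ("4-bit, no scales", IDEAL_4BIT),
                       ("Q4_0 with scales", Q4_0_BITS)]:
        gb = weight_bytes(num_params, bits) / 1e9
        saved = reduction_vs(BF16_BITS, bits)
        print(f"{name:>18}: {gb:5.2f} GB, {saved:.1%} saved vs BF16")
```

Under these assumptions, counting only the 4-bit values gives exactly 75%, while including the per-block scales gives a little under 72%, which is consistent with the "about 75%" wording. Activations, KV cache, and any tensors left at higher precision are ignored, so real on-device memory use will differ.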



marktechpost: Compare Gemma 4 edge formats: BF16, Q4_0 QAT, and mobile QAT, on published memory numbers and design tradeoffs. The post Google DeepMind Releases Gemma 4 QAT Checkpoints: Q4_0 and a New Mobile Format Cut On-Device Memory…

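Decoder · 06-03 19:54 · original post
小互 · 06-04 00:22 · original post
Google AI Developers · 06-05 16:57 · original post
ollama · 06-05 18:32 · original post
Paul Couvert · 06-05 19:02 · original post
rohanpaul_ai · 06-06 00:26 · original post
Sundar Pichai · 06-03 19:36 · original post
berryxia · 06-04 00:22 · original post
Philipp Schmid · 06-04 14:47 · original post
AI Breakfast · 06-05 15:03 · original post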
