Gemma 4 QAT checkpoints released: similar performance, roughly 4x less memory

Why it was picked

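Developers working on mobile or edge deployment now have a realistic way to run Gemma 4: with the memory footprint of Gemma 4 E2B down to about 1GB, phones and IoT-class devices become plausible targets. The checkpoints are on Hugging Face, so it is worth pulling them and trying them out.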

AI summary

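Google has released new Gemma 4 QAT (quantization-aware training) checkpoints that keep performance similar while using roughly 4x less memory. The release introduces a new mobile quantization format that brings the memory footprint of Gemma 4 E2B down to just 1GB. QAT simulates low-precision arithmetic during training, so the model learns to tolerate rounding error; the quantized model that results is smaller and faster, with quality close to the original rather than strictly lossless. The checkpoints are available on Hugging Face and can be run directly.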
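To make "simulating low-precision arithmetic during training" concrete, here is a minimal sketch of the fake-quantization step that QAT schemes typically apply to weights in the forward pass. It is an illustration only, not Google's recipe: the symmetric per-tensor scheme, the 4-bit width, and the names `fake_quantize` and `weight_bytes` are all assumptions, and the post does not describe the actual mobile format. The second helper shows why going from 16-bit to 4-bit weights would line up with the "~4x" figure, again assuming those bit widths. A real QAT setup would also need a gradient workaround for the rounding step (commonly a straight-through estimator), which is not shown.

```python
def fake_quantize(values, num_bits=4):
    """Snap floats onto a symmetric signed integer grid, then map back to floats.

    Hypothetical helper: symmetric per-tensor quantization is an assumption,
    not the format used by the Gemma 4 QAT checkpoints.
    """
    qmax = 2 ** (num_bits - 1) - 1           # 7 for signed 4-bit
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return list(values)                  # nothing to scale
    scale = max_abs / qmax                   # float step between grid points
    simulated = []
    for v in values:
        q = round(v / scale)                 # nearest integer level
        q = max(-qmax, min(qmax, q))         # clamp to the representable range
        simulated.append(q * scale)          # dequantize back to float
    return simulated


def weight_bytes(num_params, bits_per_weight):
    """Storage for the weights alone, ignoring scales and other metadata."""
    return num_params * bits_per_weight / 8


weights = [0.70, -0.33, 0.12, 0.02]
simulated = fake_quantize(weights, num_bits=4)
print(simulated)                             # roughly [0.7, -0.3, 0.1, 0.0]

# Assumed bit widths: 16-bit baseline versus 4-bit quantized weights.
ratio = weight_bytes(1_000_000, 16) / weight_bytes(1_000_000, 4)
print(ratio)
```

During training the network would compute its loss with the `simulated` weights, so the rounding error already present at inference time is something the optimizer can adapt to.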

Philipp Schmid: More Gemma 4! New QAT Gemma 4 checkpoints with similar performance while using ~4x less memory! It comes with a new mobile quantization format that reduces memory footprint of Gemma 4 E2B to just 1GB. Quantization-Aware …

Patrick Loeber · 06-09 13:17