Kog open-sources 2B model with inference speed above 3,000 tokens/s

Why it was selected

Kog has open-sourced a 2B-parameter model that runs at more than 3,000 tokens per second, which suits tasks that need fast inference.

AI summary

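Kog has open-sourced its 2B-parameter model on Hugging Face. The model was previously used in a demo that ran at 3,000+ tokens per second. Developers can download and deploy it for fast-inference use cases.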

Clement Delangue: Kog open-sourced on @huggingface the 2B model that they used to show a model running at 3,000+ tokens per second. Very cool work! huggingface.co/blog/kogai/kog…
