ByteDance Releases Lance: An Open-Source Multimodal Model Unifying Image and Video Understanding, Generation, and Editing

Why It Was Selected

A single model with 3B parameters covers both understanding and generation for images and video

AI Summary

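ByteDance's Intelligent Creation Lab has released Lance, a native unified multimodal model that handles image and video understanding, generation, and editing with only 3B active parameters. Lance reaches a BLEU-4 score of 44.8 on the MSCOCO image understanding benchmark and an FVD of 159.3 on the UCF-101 video generation test set. The model supports tasks including text-to-image, text-to-video, image editing, and video editing. Lance is open-sourced under the Apache 2.0 license, and its code and weights have been released on GitHub.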


MarkTechPost: ByteDance's Intelligent Creation Lab has released Lance, an open-source native unified multimodal model that handles image and video understanding, generation, and editing, all within a single framework, using only 3B a…
