Tokens Are Group Elements: Lie-Algebra Attention on Matrix Lie Groups

Why It Was Selected

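This paper treats the tokens themselves as group elements instead of relying on complex learned kernels. It uses 50 to 80 times fewer parameters and reports better results on tasks over SE(2), SO(3), and the affine group, so the idea is worth a look.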

AI Summary

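The paper proposes Lie-Algebra Attention, in which each token is defined as an element $g_i$ of a matrix Lie group $G$ rather than a conventional feature vector. The attention score is the closed-form squared log-norm of the relative pose, $s_{ij} = -\|\log(g_i^{-1} g_j)\|^2 / \tau$, so no kernel function has to be learned. The method applies to the non-compact, non-commutative affine group Aff(2), which vector-token methods cannot reach. In sequence-completion experiments on SE(2), SO(3), and Aff(2), it uses 50 to 80 times fewer parameters than an MLP kernel and performs better on SE(2), while the invariance error of the vector-token baselines is 5 to 12 orders of magnitude higher.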
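To make the score formula concrete, here is a minimal NumPy sketch for SE(2) only. It is an illustration under stated assumptions, not the paper's implementation: the helper names (`se2`, `se2_log`, `lie_attention_scores`, `softmax_rows`) are hypothetical, the norm is assumed to be the Euclidean norm of the Lie-algebra coordinates $(\theta, v_x, v_y)$, and the row-wise softmax is the standard attention normalization rather than something the summary states.

```python
import numpy as np

def se2(theta, x, y):
    """Build a 3x3 homogeneous SE(2) matrix from an angle and a translation."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, x], [s, c, y], [0.0, 0.0, 1.0]])

def se2_log(g):
    """Closed-form log of an SE(2) element as coordinates (theta, vx, vy)."""
    theta = np.arctan2(g[1, 0], g[0, 0])
    t = g[:2, 2]
    if abs(theta) < 1e-8:
        # Small-angle limits of sin(theta)/theta and (1 - cos(theta))/theta.
        a, b = 1.0, theta / 2.0
    else:
        a = np.sin(theta) / theta
        b = (1.0 - np.cos(theta)) / theta
    # V maps the algebra translation part v to the group translation t.
    V = np.array([[a, -b], [b, a]])
    v = np.linalg.solve(V, t)
    return np.array([theta, v[0], v[1]])

def lie_attention_scores(tokens, tau=1.0):
    """s_ij = -||log(g_i^{-1} g_j)||^2 / tau (assumed: Euclidean norm on coordinates)."""
    n = len(tokens)
    scores = np.zeros((n, n))
    for i in range(n):
        g_i_inv = np.linalg.inv(tokens[i])
        for j in range(n):
            xi = se2_log(g_i_inv @ tokens[j])
            scores[i, j] = -np.dot(xi, xi) / tau
    return scores

def softmax_rows(scores):
    """Row-wise softmax (assumed normalization, as in standard attention)."""
    z = scores - scores.max(axis=1, keepdims=True)
    w = np.exp(z)
    return w / w.sum(axis=1, keepdims=True)

if __name__ == "__main__":
    tokens = [se2(0.3, 1.0, -2.0), se2(-1.1, 0.5, 0.25), se2(1.5, -3.0, 1.0)]
    print(softmax_rows(lie_attention_scores(tokens, tau=2.0)))
```

Because the score depends only on the relative pose $g_i^{-1} g_j$, left-multiplying every token by the same group element $h$ leaves the scores unchanged, since $(h g_i)^{-1}(h g_j) = g_i^{-1} g_j$. That is the invariance property the summary contrasts with the vector-token baselines.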


arXiv cs.LG

We place the attention token on the group: a token is an element $g_i$ of a matrix Lie group $G$ -- a bare transformation, with no feature payload and no external action $\rho(g)$ carrying it. To our knowledge this is the f…
