Enhancing the Interpretability of Transformer Audio Models with Entropy-Guided Attention

Listening with Attention: Entropy-Guided Explainability for Transformer-Based Audio Models

Why it was selected

Explanations of Whisper's predictions become more usable.

AI Summary

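Predictions from Transformer ASR models such as Whisper are hard to interpret. The LEAF-X framework combines entropy-guided attention weighting, multi-layer attention rollout, and causal ablation to locate low-entropy, high-impact heads and layers, and to produce sparse token-to-frame attributions. Compared with perturbation-based explainers or raw attention maps, LEAF-X better reflects the model's actual computation: faithfulness improves by 32%, locality/sparsity improves by 35-39%, and its attributions are the most stable.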
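The summary does not give LEAF-X's formulas, so the following is only a minimal sketch of the two attention-side ideas it names, under stated assumptions. Heads are weighted by softmax of their negative mean row entropy (an assumed rule, chosen so that more focused heads count more), and layers are combined with the standard attention-rollout recipe on square self-attention matrices. All function names are hypothetical, and Whisper's decoder cross-attention, which is what token-to-frame attribution would actually involve, is not modelled here.

```python
import math

def row_entropy(row):
    """Shannon entropy (in nats) of one attention distribution."""
    return -sum(p * math.log(p) for p in row if p > 0.0)

def head_weights(heads):
    """Give focused (low-entropy) heads larger weights.

    Assumption: weight = softmax(-mean row entropy). This is an
    illustrative choice, not a rule taken from the paper.
    """
    mean_h = [sum(row_entropy(r) for r in h) / len(h) for h in heads]
    exps = [math.exp(-m) for m in mean_h]
    total = sum(exps)
    return [e / total for e in exps]

def combine_heads(heads, weights):
    """Entropy-weighted average of the heads' attention matrices."""
    n_rows, n_cols = len(heads[0]), len(heads[0][0])
    return [
        [sum(w * h[i][j] for w, h in zip(weights, heads)) for j in range(n_cols)]
        for i in range(n_rows)
    ]

def matmul(a, b):
    """Plain list-of-lists matrix product."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def rollout(layers):
    """Attention rollout: mix each layer with the identity to account for
    the residual connection, then multiply from the first layer upward."""
    n = len(layers[0])
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for layer in layers:
        mixed = [
            [0.5 * layer[i][j] + (0.5 if i == j else 0.0) for j in range(n)]
            for i in range(n)
        ]
        result = matmul(mixed, result)
    return result

# Toy example: one focused head and one uniform head over two positions.
focused = [[0.9, 0.1], [0.1, 0.9]]
diffuse = [[0.5, 0.5], [0.5, 0.5]]
weights = head_weights([focused, diffuse])
layer = combine_heads([focused, diffuse], weights)
attribution = rollout([layer, layer])
```

In this toy case the focused head receives the larger weight, and each row of `attribution` remains a probability distribution over input positions. The causal-ablation step mentioned in the summary would additionally require running the model with selected heads masked, which this sketch does not attempt.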

arXiv cs.AI

Transformer-based automatic speech recognition (ASR) models such as Whisper are highly accurate, but their predictions remain difficult to interpret. Existing explainable AI (XAI) methods often lack faithfulness and prec