Paper title
Short-Term Word-Learning in a Dynamically Changing Environment
Paper authors
Abstract
Neural sequence-to-sequence automatic speech recognition (ASR) systems are, in principle, open-vocabulary systems when they use appropriate modeling units. In practice, however, they often fail to recognize words not seen during training, e.g., named entities, numbers, or technical terms. To alleviate this problem, Huber et al. proposed supplementing an end-to-end ASR system with a word/phrase memory and a mechanism for accessing this memory to recognize those words and phrases correctly. In this paper, we study (a) methods to acquire important words for this memory dynamically and (b) the trade-off between improved recognition accuracy for new words and the potential danger of false alarms for those added words. When an appropriate number of new words is used, we demonstrate significant improvements in the detection rate of new words with only a minor increase in false alarms (F1 score 0.30 $\rightarrow$ 0.80). In addition, we show that important keywords can be extracted from supporting documents and used effectively.
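The trade-off between detecting new words and raising false alarms can be illustrated with a small F1 computation. The sketch below is not the paper's evaluation protocol, which the abstract does not specify; it assumes a simplified bag-of-words comparison in which a memory word counts as detected if it appears in both the reference and the ASR hypothesis, and as a false alarm if the hypothesis contains it more often than the reference. The function name `new_word_f1` and its parameters are hypothetical.

```python
from collections import Counter


def new_word_f1(reference, hypothesis, memory_words):
    """Return (precision, recall, F1) for memory words in one utterance.

    Assumption: occurrences are matched by count only (no word alignment),
    which is a simplification of how an ASR evaluation would normally work.
    """
    memory_words = set(memory_words)
    ref_counts = Counter(w for w in reference if w in memory_words)
    hyp_counts = Counter(w for w in hypothesis if w in memory_words)

    # Correct detections: occurrences present in both reference and hypothesis.
    true_pos = sum(min(ref_counts[w], hyp_counts[w]) for w in memory_words)
    # False alarms: memory words emitted more often than they were spoken.
    false_pos = sum(hyp_counts.values()) - true_pos
    # Misses: spoken memory words that the hypothesis lacks.
    false_neg = sum(ref_counts.values()) - true_pos

    if true_pos == 0:
        return 0.0, 0.0, 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

Under this simplification, adding more words to the memory can raise recall (fewer misses) while lowering precision (more false alarms), which is the balance the F1 score summarizes.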
