Paper Title
Interpretable Reinforcement Learning with Multilevel Subgoal Discovery
Paper Authors
Paper Abstract
We propose a novel Reinforcement Learning model for discrete environments that is inherently interpretable and supports the discovery of deep subgoal hierarchies. In the model, an agent learns information about the environment in the form of probabilistic rules, while policies for (sub)goals are learned as combinations thereof. No reward function is required for learning; an agent only needs to be given a primary goal to achieve. Subgoals of a goal G from the hierarchy are computed as descriptions of states which, if previously achieved, increase the total efficiency of the available policies for G. These state descriptions are introduced as new sensor predicates into the rule language of the agent, which allows for sensing important intermediate states and for updating environment rules and policies accordingly.
