Paper Title
A Novel Update Mechanism for Q-Networks Based On Extreme Learning Machines
Paper Authors
Paper Abstract
Reinforcement learning is a popular machine learning paradigm which can find near-optimal solutions to complex problems. Most often, these procedures involve function approximation using neural networks with gradient-based updates to optimise weights for the problem being considered. While this common approach generally works well, there are other update mechanisms which are largely unexplored in reinforcement learning. One such mechanism is the Extreme Learning Machine (ELM). ELMs were initially proposed to drastically improve the training speed of neural networks and have since seen many applications. Here we attempt to apply extreme learning machines to a reinforcement learning problem in the same manner as gradient-based updates. This new algorithm is called the Extreme Q-Learning Machine (EQLM). We compare its performance to a typical Q-Network on the cart-pole task, a benchmark reinforcement learning problem, and show that EQLM has similar long-term learning performance to a Q-Network.
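To illustrate the general idea of replacing a gradient step with an extreme-learning-machine update, here is a minimal sketch. It is not the EQLM algorithm from the paper, whose details the abstract does not give. It assumes a single hidden layer with fixed random weights, a tanh activation, and output weights solved by ridge-regularised least squares against standard Q-learning targets. The names `elm_fit`, `elm_predict`, `q_targets` and all hyperparameter values are hypothetical.

```python
import numpy as np

def elm_fit(X, T, n_hidden=32, reg=1e-3, seed=0):
    """Fit an ELM: random, fixed hidden layer; output weights by least squares.

    Assumption: tanh activation and a ridge term `reg`; neither is taken from the paper.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights, never trained
    b = rng.normal(size=n_hidden)                # random hidden biases, never trained
    H = np.tanh(X @ W + b)                       # hidden-layer activations
    # Closed-form output weights: beta = (H^T H + reg * I)^{-1} H^T T
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ T)
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Evaluate the network: one Q-value per action for each input state."""
    return np.tanh(X @ W + b) @ beta

def q_targets(q_now, q_next, actions, rewards, dones, gamma=0.99):
    """Standard Q-learning targets, r + gamma * max_a' Q(s', a'), for the taken action.

    Entries for actions that were not taken keep their current estimates.
    """
    targets = np.array(q_now, dtype=float)       # copy so the caller's array is untouched
    idx = np.arange(len(actions))
    targets[idx, actions] = rewards + gamma * (1.0 - dones) * q_next.max(axis=1)
    return targets
```

In this sketch, a batch update would compute `q_targets` from the current network's predictions and then call `elm_fit` on the batch states and targets, so the output weights are re-solved in one step instead of being nudged along a gradient.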
