Paper Title
Exact Phase Transitions in Deep Learning
Authors
Abstract
This work reports deep-learning-unique first-order and second-order phase transitions, whose phenomenology closely follows that in statistical physics. In particular, we prove that the competition between prediction error and model complexity in the training loss leads to the second-order phase transition for nets with one hidden layer and the first-order phase transition for nets with more than one hidden layer. The proposed theory is directly relevant to the optimization of neural networks and points to an origin of the posterior collapse problem in Bayesian deep learning.
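
As a rough illustration of the competition described in the abstract, the sketch below uses a hypothetical scalar toy model rather than the paper's own setup. It assumes a chain of `depth + 1` scalar linear weights that all share one value `b`, unit-variance input and target with correlation `a`, and a weight-decay strength `gamma` standing in for model complexity. The names `toy_loss`, `order_parameter`, and `largest_jump` are illustrative only. Under these assumptions, the minimizer `b` shrinks to zero continuously for one hidden layer and drops to zero discontinuously for two.

```python
import numpy as np

def toy_loss(b, gamma, depth, a=1.0):
    """Assumed toy loss: prediction error plus weight decay.

    With n = depth + 1 equal weights b, E[x^2] = E[y^2] = 1 and E[xy] = a:
      E[(b^n x - y)^2] = b^(2n) - 2 a b^n + 1
      weight decay     = gamma * n * b^2
    """
    n = depth + 1
    return b ** (2 * n) - 2.0 * a * b ** n + 1.0 + gamma * n * b ** 2

def order_parameter(gamma, depth, a=1.0):
    """Global minimizer of the toy loss over b >= 0, found by grid search."""
    grid = np.linspace(0.0, 1.5, 15001)
    return grid[np.argmin(toy_loss(grid, gamma, depth, a))]

def largest_jump(depth, gammas):
    """Largest change of the minimizer between neighbouring gamma values."""
    b_star = np.array([order_parameter(g, depth) for g in gammas])
    return np.max(np.abs(np.diff(b_star)))

gammas = np.linspace(0.0, 1.2, 241)
for depth in (1, 2):
    # A small value suggests a continuous (second-order-like) transition,
    # a large value a discontinuous (first-order-like) one.
    print(f"hidden layers = {depth}, largest jump in b = {largest_jump(depth, gammas):.3f}")
```

For one hidden layer the toy loss is a quartic in `b` with only even powers, so the minimizer behaves like a Landau order parameter that vanishes continuously as `gamma` approaches `a`. For two hidden layers the loss contains a cubic term, which keeps a nonzero minimum competitive with `b = 0` until it is abruptly overtaken.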
