A Survey of Singular Learning | AISC
For slides and more information on the paper, visit https://aisc.ai.science/events/2019-09-09
Discussion lead: Mehdi Garrousian
Motivation:
Singular Learning
This session surveys results from the work of Sumio Watanabe [1] on using resolution-of-singularities techniques from algebraic geometry to improve learning and model selection when the Fisher information matrix of the learning machine is singular. This turns out to be the case almost always!
In mathematics, a singularity is a point of an algebraic variety where the tangent space is ill-behaved. We shall see that singularities make learning more challenging: they substantially worsen the bias-variance tradeoff, and the learning machine lacks the desired convergence properties regardless of the number of training examples.
The Fisher information matrix is the Hessian of the KL divergence (the loss function) at the true parameter. We follow [2] to take a closer look at how singularities manifest in practice by examining the eigenvalue spectrum of the Hessian of the loss function for some typical neural network examples.
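As a minimal sketch of what a singular Fisher information matrix looks like, consider the toy regression model y = a*tanh(b*x) plus Gaussian noise. This model, the input grid, and the function name fisher_information are illustrative assumptions and are not taken from the session materials or from [2]. For Gaussian regression, the Fisher information is the average outer product of the gradient of the regression function, divided by the noise variance. When the true parameter has a = 0, the gradient with respect to b vanishes and the matrix loses rank.

```python
import numpy as np


def fisher_information(a, b, xs, sigma=1.0):
    """Fisher information of y = a*tanh(b*x) + N(0, sigma^2) noise,
    averaged over the inputs xs (an assumed input distribution)."""
    t = np.tanh(b * xs)
    # Gradient of the regression function with respect to (a, b):
    # d/da = tanh(b*x), d/db = a * x * sech^2(b*x) = a * x * (1 - tanh^2(b*x)).
    grads = np.stack([t, a * xs * (1.0 - t**2)], axis=1)
    # For Gaussian noise: I(w) = E_x[grad f grad f^T] / sigma^2.
    return grads.T @ grads / (len(xs) * sigma**2)


# Hypothetical input grid used only for illustration.
xs = np.linspace(-2.0, 2.0, 401)

# Regular point: both eigenvalues are strictly positive.
regular = np.linalg.eigvalsh(fisher_information(1.0, 1.0, xs))

# Singular point (a = 0): b is not identifiable, so one eigenvalue is zero.
singular = np.linalg.eigvalsh(fisher_information(0.0, 1.0, xs))

print("eigenvalues at (a, b) = (1, 1):", regular)
print("eigenvalues at (a, b) = (0, 1):", singular)
```

At a = 0 every value of b describes the same function, so the set of true parameters is a whole line rather than a single point. This is the kind of degeneracy that shows up as zero or near-zero eigenvalues in the spectrum.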
[1] Almost All Learning Machines are Singular, Sumio Watanabe
[2] Eigenvalues of the Hessian in Deep Learning: Singularity and Beyond, Levent Sagun, Leon Bottou, Yann LeCun