2 Experimental Validation and conclusions

Mishkin validates the method experimentally, using the FitNet architectures from Romero (2015) across a handful of experiments.

MNIST: roughly 0.5% error in general, with modest improvements from LSUV and SVM.
CIFAR-10/100: good performance relative to the state of the art at the time. LSUV was about one percentage point better on CIFAR-10 (92 -> 93).

Further experimental results show that their orthonormal-based methods outperform scaled Gaussian-noise approaches for most activation functions, with tanh as the exception.

Overall, the method can be summarized in six succinct lines of pseudocode and gives good results. I don't know how it compares with modern approaches to initialization, but it does have compelling experimental backing.
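
To make the "six lines of pseudocode" concrete, here is a rough NumPy sketch of an LSUV-style initialization for a plain fully connected ReLU network. It is my own reading of the idea, not the authors' code: the function names (`orthonormal`, `lsuv_init`), the tolerance, the iteration cap, and the choice to measure variance on the pre-activation output over the whole minibatch are all assumptions made for illustration.

```python
import numpy as np


def orthonormal(rng, fan_in, fan_out):
    """Return a (fan_in, fan_out) matrix with orthonormal columns (or rows)."""
    a = rng.standard_normal((fan_in, fan_out))
    # Reduced QR of the taller orientation yields orthonormal columns;
    # transpose back when the layer is wider than it is tall.
    q, _ = np.linalg.qr(a if fan_in >= fan_out else a.T)
    return q if fan_in >= fan_out else q.T


def lsuv_init(layer_sizes, batch, tol=0.05, max_iter=10, seed=0):
    """LSUV-style init for a ReLU MLP (hypothetical helper, assumed defaults)."""
    rng = np.random.default_rng(seed)
    weights = []
    x = batch
    for fan_in, fan_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        # Step 1: start from an orthonormal matrix.
        w = orthonormal(rng, fan_in, fan_out)
        # Step 2: rescale until the layer output has roughly unit variance.
        for _ in range(max_iter):
            var = (x @ w).var()
            if abs(var - 1.0) < tol:
                break
            w = w / np.sqrt(var)
        weights.append(w)
        # Pass the ReLU output on as the input to the next layer.
        x = np.maximum(x @ w, 0.0)
    return weights
```

For a purely linear layer like this, a single rescale already brings the variance to one, so the inner loop exits almost immediately; the iteration cap only matters for layer types where the output variance does not scale exactly with the weights. Calling `lsuv_init([32, 64, 16], batch)` with a `(256, 32)` minibatch of random data returns two weight matrices whose pre-activation outputs each have variance close to one.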