This week I combined the multi-layer perceptron with a restricted Boltzmann machine to create my first deep learning machine.
I trained the machine to classify the MNIST dataset, a commonly used benchmark consisting of handwritten Arabic numerals.
Above you can see the MLP training on the outputs of the autoencoder as well as the final validation accuracy results for each class. The overall final accuracy is 92.8%, which is about as much as I could hope to achieve with this type of machine.
The “5” class performs the worst in every configuration I’ve tried.
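For anyone who wants to reproduce this kind of per-class breakdown, here is a minimal sketch of how it can be computed from true and predicted labels. The helper name `per_class_accuracy` and the toy labels are illustrative assumptions; they are not the code or the numbers from this experiment.

```python
from collections import Counter

def per_class_accuracy(y_true, y_pred):
    """Return {class_label: fraction of that class predicted correctly}."""
    totals = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / totals[label] for label in sorted(totals)}

# Toy labels for illustration only; these are NOT the MNIST results.
y_true = [3, 3, 5, 5, 5, 5, 8, 8]
y_pred = [3, 3, 5, 3, 8, 5, 8, 8]

acc = per_class_accuracy(y_true, y_pred)
worst = min(acc, key=acc.get)  # class with the lowest accuracy
print(acc, "worst class:", worst)
```

Sorting the classes by this value makes it easy to spot a consistently weak class across runs.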
Below is a visualization using t-SNE for dimensionality reduction on the MNIST set. You can see why the “5” class performs the worst. The “5” class lies centrally and is close to several different classes.
I picked my initial parameters from “Reducing the Dimensionality of Data with Neural Networks” by G. E. Hinton and R. R. Salakhutdinov. I guessed at a learning rate of 0.01 (1%) and 200 training epochs, since the paper suggested that low learning rates and long training worked best. I really should’ve trained for more epochs, but I needed to free up my compute resources for other coursework.
My initial parameters worked so well that I haven’t found any substantial way to improve on them.
I’ve tried different sizes and depths of MLP layers, more aggressive learning rates, and more dimensionality reduction with the autoencoder, but nothing significantly improved the results.
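Trying these variations by hand gets tedious, so here is a rough sketch of how such a search could be automated. Everything in it is an assumption made for illustration: the `sweep` helper, the grid values, and especially `fake_train_and_score`, which is a stand-in for a real training run and does not train anything.

```python
from itertools import product

def sweep(train_and_score, grid):
    """Try every combination in `grid`; return (best_score, best_params)."""
    names = list(grid)
    best = None
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = train_and_score(**params)
        if best is None or score > best[0]:
            best = (score, params)
    return best

# Hypothetical stand-in for a real training run. A real version would train
# the network with these settings and return its validation accuracy.
def fake_train_and_score(learning_rate, hidden_units, epochs):
    return 1.0 - abs(learning_rate - 0.01) - abs(hidden_units - 256) / 10_000

# Illustrative grid; only the 0.01 learning rate and 200 epochs come from the post.
grid = {
    "learning_rate": [0.001, 0.01, 0.1],
    "hidden_units": [128, 256, 512],
    "epochs": [200],
}

best_score, best_params = sweep(fake_train_and_score, grid)
print(best_score, best_params)
```

An exhaustive grid grows quickly with each added parameter, so on limited hardware it probably makes sense to keep the grid small or sample from it.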
- experiment with additional image preprocessing beyond the threshold filter
  - e.g. erosion and dilation to eliminate isolated pixels and smooth the outlines
- programmatically test different parameters with a faster machine
- more training epochs
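As a sketch of the preprocessing idea in the list above: erosion followed by dilation (a morphological "opening") removes isolated pixels while roughly restoring larger shapes. The code below is plain Python on a tiny toy image. The cutoff of 128 and the 3x3 neighborhood are assumptions, and a 3x3 erosion may well be too aggressive for thin MNIST strokes, so treat it as a starting point rather than a tested recipe.

```python
def threshold(image, cutoff=128):
    """Binarize a greyscale image (list of rows); cutoff is an assumed value."""
    return [[1 if px >= cutoff else 0 for px in row] for row in image]

def _neighborhood(img, r, c):
    """Yield the 3x3 neighborhood of (r, c), treating out-of-bounds as 0."""
    h, w = len(img), len(img[0])
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            yield img[rr][cc] if 0 <= rr < h and 0 <= cc < w else 0

def erode(img):
    """A pixel stays on only if its whole 3x3 neighborhood is on."""
    return [[int(all(_neighborhood(img, r, c))) for c in range(len(img[0]))]
            for r in range(len(img))]

def dilate(img):
    """A pixel turns on if any pixel in its 3x3 neighborhood is on."""
    return [[int(any(_neighborhood(img, r, c))) for c in range(len(img[0]))]
            for r in range(len(img))]

# Toy 6x6 image: a 3x3 blob plus one isolated noise pixel in the top-right corner.
image = [
    [0,   0,   0,   0, 0, 255],
    [0, 200, 200, 200, 0,   0],
    [0, 200, 200, 200, 0,   0],
    [0, 200, 200, 200, 0,   0],
    [0,   0,   0,   0, 0,   0],
    [0,   0,   0,   0, 0,   0],
]

opened = dilate(erode(threshold(image)))  # opening = erosion, then dilation
for row in opened:
    print(row)
```

In this toy case the blob survives and the stray pixel disappears. On real digits a library routine with a smaller structuring element would likely be the better choice.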