Training of Restricted Boltzmann Machines – Københavns Universitet

Training of Restricted Boltzmann Machines

PhD defense by Asja Fischer

Abstract

Restricted Boltzmann machines (RBMs) are probabilistic graphical models that can also be interpreted as stochastic neural networks. Training RBMs is known to be challenging: computing the likelihood of the model parameters, or its gradient, is in general computationally intensive. Training therefore relies on sampling-based approximations of the log-likelihood gradient.

I will present an empirical and theoretical analysis of the bias of these approximations and show that the approximation error can distort the learning process. The bias decreases as the mixing rate of the applied sampling procedure increases, and I will introduce a transition operator that leads to faster mixing. Finally, I will discuss a different parametrisation of RBMs that leads to better learning results and greater robustness against changes in the data representation.
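To make the idea of a sampling-based approximation of the log-likelihood gradient concrete, the following is a minimal sketch of contrastive divergence with k Gibbs steps (CD-k) for a binary RBM. The abstract does not name a specific algorithm, so the choice of CD-k is an assumption made for illustration, as are all names used here (`cd_k_weight_gradient`, the weight matrix `W`, the visible and hidden biases `b` and `c`). It is not taken from the thesis. The Gibbs chain is started at the training example and run for only k steps, which is the source of the bias discussed above.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_hidden(v, W, c, rng):
    """Return (p(h_j = 1 | v) for all j, a binary sample of h)."""
    probs = [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
             for j in range(len(c))]
    return probs, [1 if rng.random() < p else 0 for p in probs]


def sample_visible(h, W, b, rng):
    """Return (p(v_i = 1 | h) for all i, a binary sample of v)."""
    probs = [sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h))))
             for i in range(len(b))]
    return probs, [1 if rng.random() < p else 0 for p in probs]


def cd_k_weight_gradient(v0, W, b, c, k=1, rng=None):
    """CD-k estimate of the log-likelihood gradient w.r.t. W for one example v0.

    Hypothetical helper for illustration only (binary units assumed).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    rng = rng or random.Random(0)

    # Positive phase: hidden probabilities given the training example.
    ph0, h = sample_hidden(v0, W, c, rng)

    # Negative phase: k steps of block Gibbs sampling started at v0.
    v = v0
    for _ in range(k):
        _, v = sample_visible(h, W, b, rng)
        phk, h = sample_hidden(v, W, c, rng)

    # Data-dependent term minus the (approximate) model-dependent term.
    return [[v0[i] * ph0[j] - v[i] * phk[j] for j in range(len(c))]
            for i in range(len(b))]


# Usage: 3 visible units, 2 hidden units, all parameters initialised to zero.
W = [[0.0, 0.0] for _ in range(3)]
b = [0.0, 0.0, 0.0]
c = [0.0, 0.0]
grad = cd_k_weight_gradient([1, 0, 1], W, b, c, k=1, rng=random.Random(42))
```

A larger k lets the chain move closer to the model distribution before the negative-phase statistics are collected, at the cost of more sampling steps per update.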

Assessment Committee:

Chairman: Professor Mads Nielsen, Department of Computer Science, University of Copenhagen

Member 1: Professor Klaus-Robert Müller, TU Berlin, Germany

Member 2: Assistant Professor Aaron Courville, Département d’informatique et de recherche opérationnelle, Université de Montréal, Canada

Academic advisor: Professor Christian Igel, Department of Computer Science, University of Copenhagen

For an electronic copy of the thesis, please contact Susan Nasirumbi Ipsen, suntonn@di.ku.dk