Empirical analysis of the divergence of Gibbs sampling based learning algorithms for restricted Boltzmann machines

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Learning algorithms relying on Gibbs sampling based stochastic approximations of the log-likelihood gradient have become a common way to train Restricted Boltzmann Machines (RBMs). We study three of these methods: Contrastive Divergence (CD) and its refined variants Persistent CD (PCD) and Fast PCD (FPCD). As the approximations are biased, the maximum of the log-likelihood is not necessarily obtained. Recently, it has been shown that CD, PCD, and FPCD can even lead to a steady decrease of the log-likelihood during learning. Using artificial data sets from the literature, we study these divergence effects in more detail. Our results indicate that the log-likelihood tends to diverge especially when the target distribution is difficult for the RBM to learn. The decrease in likelihood cannot be detected by an increase in the reconstruction error, which has been proposed as a stopping criterion for CD learning. Weight decay with a carefully chosen weight-decay parameter can prevent divergence.
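
The following is a minimal numpy sketch of CD-k training for a binary-binary RBM with optional weight decay and the reconstruction error discussed above. It is an illustration of the general technique, not the authors' code; the hyperparameters, initialization, and helper names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Binary-binary RBM trained with CD-k (illustrative sketch)."""

    def __init__(self, n_visible, n_hidden, lr=0.05, weight_decay=0.0):
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)   # visible biases
        self.c = np.zeros(n_hidden)    # hidden biases
        self.lr = lr
        self.weight_decay = weight_decay

    def sample_h(self, v):
        p = sigmoid(v @ self.W + self.c)
        return p, (rng.random(p.shape) < p).astype(float)

    def sample_v(self, h):
        p = sigmoid(h @ self.W.T + self.b)
        return p, (rng.random(p.shape) < p).astype(float)

    def cd_step(self, v0, k=1):
        """One CD-k parameter update on a mini-batch v0 of shape (n, n_visible)."""
        ph0, h = self.sample_h(v0)
        vk, pv = v0, v0
        for _ in range(k):
            pv, vk = self.sample_v(h)
            phk, h = self.sample_h(vk)
        # Positive phase uses the data, negative phase the k-step Gibbs reconstruction.
        n = v0.shape[0]
        grad_W = (v0.T @ ph0 - vk.T @ phk) / n
        self.W += self.lr * (grad_W - self.weight_decay * self.W)
        self.b += self.lr * (v0 - vk).mean(axis=0)
        self.c += self.lr * (ph0 - phk).mean(axis=0)
        # Reconstruction error: the quantity that, per the abstract, does NOT
        # reliably signal divergence of the log-likelihood.
        return np.mean((v0 - pv) ** 2)

# Example usage on random binary data (illustrative only).
rbm = RBM(n_visible=16, n_hidden=8, lr=0.05, weight_decay=1e-4)
data = (rng.random((100, 16)) < 0.5).astype(float)
for epoch in range(10):
    recon_err = rbm.cd_step(data, k=1)
```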
Original language: English
Title of host publication: Artificial Neural Networks – ICANN 2010: 20th International Conference, Thessaloniki, Greece, September 15-18, 2010, Proceedings, Part III
Editors: K. Diamantaras, W. Duch, L. S. Iliadis
Number of pages: 10
Publisher: Springer
Publication date: 2010
Pages: 208-217
ISBN (Print): 978-3-642-15824-7
ISBN (Electronic): 978-3-642-15825-4
Publication status: Published - 2010
Externally published: Yes
Event: 20th International Conference on Artificial Neural Networks (ICANN 2010) - Thessaloniki, Greece
Duration: 15 Sep 2010 – 18 Sep 2010

Conference

Conference: 20th International Conference on Artificial Neural Networks (ICANN 2010)
Country: Greece
City: Thessaloniki
Period: 15/09/2010 – 18/09/2010
Series: Lecture Notes in Computer Science
Volume: 6354
