A Probabilistic Programming Approach to Protein Structure Superposition
Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
Standard
A Probabilistic Programming Approach to Protein Structure Superposition. / Moreta, Lys Sanz; Al-Sibahi, Ahmad Salim; Theobald, Douglas; Bullock, William; Rommes, Basile Nicolas; Manoukian, Andreas; Hamelryck, Thomas.
2019 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2019. ed. / Giacomo Baruzzo; Sebastian Daberdaku; Barbara Di Camillo; Simone Furini; Emanuele Domenico Giordano; Giuseppe Nicosia. IEEE, 2019. 8791469.
RIS
TY - GEN
T1 - A Probabilistic Programming Approach to Protein Structure Superposition
AU - Moreta, Lys Sanz
AU - Al-Sibahi, Ahmad Salim
AU - Theobald, Douglas
AU - Bullock, William
AU - Rommes, Basile Nicolas
AU - Manoukian, Andreas
AU - Hamelryck, Thomas
PY - 2019
Y1 - 2019
AB - Optimal superposition of protein structures is crucial for understanding their structure, function, dynamics and evolution. We investigate the use of probabilistic programming to superimpose protein structures guided by a Bayesian model. Our model THESEUS-PP is based on the THESEUS model, a probabilistic model of protein superposition based on rotation, translation and perturbation of an underlying, latent mean structure. The model was implemented in the deep probabilistic programming language Pyro. Unlike conventional methods that minimize the sum of the squared distances, THESEUS takes into account correlated atom positions and heteroscedasticity (i.e., atom positions can feature different variances). THESEUS performs maximum likelihood estimation using iterative expectation-maximization. In contrast, THESEUS-PP allows automated maximum a posteriori (MAP) estimation using suitable priors over rotation, translation, variances and latent mean structure. The results indicate that probabilistic programming is a powerful new paradigm for the formulation of Bayesian probabilistic models concerning biomolecular structure. Specifically, we envision the use of the THESEUS-PP model as a suitable error model or likelihood in Bayesian protein structure prediction using deep probabilistic programming.
KW - Bayesian modelling
KW - deep probabilistic programming
KW - protein structure prediction
KW - protein superposition
U2 - 10.1109/CIBCB.2019.8791469
DO - 10.1109/CIBCB.2019.8791469
M3 - Article in proceedings
BT - 2019 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2019
A2 - Baruzzo, Giacomo
A2 - Daberdaku, Sebastian
A2 - Di Camillo, Barbara
A2 - Furini, Simone
A2 - Giordano, Emanuele Domenico
A2 - Nicosia, Giuseppe
PB - IEEE
T2 - 16th IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2019
Y2 - 9 July 2019 through 11 July 2019
ER -
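The abstract contrasts THESEUS and THESEUS-PP with conventional methods that minimize the sum of the squared distances between atoms. As a point of reference, the following is a minimal sketch of that conventional baseline (the Kabsch least-squares superposition), not of THESEUS-PP itself and not taken from the paper. The function name `kabsch_superpose`, the use of NumPy, and the synthetic coordinates are assumptions made for illustration only. Note that this baseline weights every atom equally, which is exactly the limitation the abstract says THESEUS addresses through per-atom variances and correlations.

```python
import numpy as np


def kabsch_superpose(mobile, target):
    """Least-squares rigid superposition of `mobile` onto `target`.

    Both inputs are (N, 3) arrays of matched atom coordinates.
    Returns the fitted coordinates, the rotation matrix, and the RMSD.
    Hypothetical helper for illustration; every atom gets equal weight.
    """
    mobile = np.asarray(mobile, dtype=float)
    target = np.asarray(target, dtype=float)

    # Remove translation by centering both structures on their centroids.
    mobile_center = mobile.mean(axis=0)
    target_center = target.mean(axis=0)
    p = mobile - mobile_center
    q = target - target_center

    # 3 x 3 cross-covariance matrix and its singular value decomposition.
    h = p.T @ q
    u, _, vt = np.linalg.svd(h)

    # Flip the last axis if needed so the result is a proper rotation
    # (determinant +1) rather than a reflection.
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T

    # Rotate the centered mobile structure and move it onto the target centroid.
    fitted = p @ rotation.T + target_center
    rmsd = np.sqrt(np.mean(np.sum((fitted - target) ** 2, axis=1)))
    return fitted, rotation, rmsd


# Synthetic example: rotate and translate a random "structure", then recover it.
rng = np.random.default_rng(0)
target = rng.normal(size=(10, 3))
theta = 0.7
true_rotation = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta), np.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
])
mobile = target @ true_rotation.T + np.array([1.0, -2.0, 0.5])

fitted, rotation, rmsd = kabsch_superpose(mobile, target)
print("RMSD after superposition:", rmsd)
```

Because the synthetic `mobile` structure is an exact rigid transform of `target`, the recovered rotation should be the inverse of `true_rotation` and the residual RMSD should be numerically negligible. With real structures the residual is nonzero, and a model such as the one described in the abstract would additionally place priors on the rotation, translation, variances and mean structure.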