Deep Probabilistic Programming Applied Protein Superposition: Protein Structure Prediction and Ancestral Sequence Resurrection

Research output: Book/Report > Ph.D. thesis > Research

Standard

Deep Probabilistic Programming Applied Protein Superposition : Protein Structure Prediction and Ancestral Sequence Resurrection. / Sanz Moreta, Lys.

Department of Computer Science, Faculty of Science, University of Copenhagen, 2022. 138 p.

Harvard

Sanz Moreta, L 2022, Deep Probabilistic Programming Applied Protein Superposition: Protein Structure Prediction and Ancestral Sequence Resurrection. Department of Computer Science, Faculty of Science, University of Copenhagen.

APA

Sanz Moreta, L. (2022). Deep Probabilistic Programming Applied Protein Superposition: Protein Structure Prediction and Ancestral Sequence Resurrection. Department of Computer Science, Faculty of Science, University of Copenhagen.

Vancouver

Sanz Moreta L. Deep Probabilistic Programming Applied Protein Superposition: Protein Structure Prediction and Ancestral Sequence Resurrection. Department of Computer Science, Faculty of Science, University of Copenhagen, 2022. 138 p.

Author

Sanz Moreta, Lys. / Deep Probabilistic Programming Applied Protein Superposition : Protein Structure Prediction and Ancestral Sequence Resurrection. Department of Computer Science, Faculty of Science, University of Copenhagen, 2022. 138 p.

Bibtex

@phdthesis{76f4e8ab4d544b0181dcaf701565bd13,
title = "Deep Probabilistic Programming Applied Protein Superposition: Protein Structure Prediction and Ancestral Sequence Resurrection",
abstract = "This thesis covers several topics in structural bioinformatics, molecular evolution, and probabilistic programming. It presents new methods for protein superposition, protein structure prediction, and ancestral sequence resurrection. The first manuscript addresses protein superposition by presenting Theseus-PP [1]. This method uses a Bayesian approach instead of the maximum likelihood method implemented in the original Theseus [2], which allows relevant priors to be placed over the model{\textquoteright}s parameters. The superposition model is regarded as a new type of error loss function that can assist protein structure inference. The second manuscript extends Theseus-PP into Theseus-HMC [3]. This method uses Hamiltonian Monte Carlo inference, specifically the No-U-Turn Sampler [4], to compute uncertainty over the parameters of the superposition problem. The third manuscript adapts the generative Deep Markov Model (DMM) [5] to the prediction of protein fragment libraries [6]. Deep Markov Models extend classical Hidden Markov Models by using both amortized inference and gated neural networks (such as recurrent neural networks [7]) over the emission and transition probabilities to preserve long-range dependencies across sequences. This variant of the DMM uses Bayesian inference to compute uncertainty over the fragment predictions. The last manuscript proposes an approach to ancestral protein resurrection that overcomes the assumption of factorized evolution and encodes sequence evolution using a tree-structured Ornstein–Uhlenbeck latent process [8].",
author = "{Sanz Moreta}, Lys",
year = "2022",
language = "English",
publisher = "Department of Computer Science, Faculty of Science, University of Copenhagen",

}

RIS

TY - BOOK

T1 - Deep Probabilistic Programming Applied Protein Superposition

T2 - Protein Structure Prediction and Ancestral Sequence Resurrection

AU - Sanz Moreta, Lys

PY - 2022

Y1 - 2022

N2 - This thesis covers several topics in structural bioinformatics, molecular evolution, and probabilistic programming. It presents new methods for protein superposition, protein structure prediction, and ancestral sequence resurrection. The first manuscript addresses protein superposition by presenting Theseus-PP [1]. This method uses a Bayesian approach instead of the maximum likelihood method implemented in the original Theseus [2], which allows relevant priors to be placed over the model’s parameters. The superposition model is regarded as a new type of error loss function that can assist protein structure inference. The second manuscript extends Theseus-PP into Theseus-HMC [3]. This method uses Hamiltonian Monte Carlo inference, specifically the No-U-Turn Sampler [4], to compute uncertainty over the parameters of the superposition problem. The third manuscript adapts the generative Deep Markov Model (DMM) [5] to the prediction of protein fragment libraries [6]. Deep Markov Models extend classical Hidden Markov Models by using both amortized inference and gated neural networks (such as recurrent neural networks [7]) over the emission and transition probabilities to preserve long-range dependencies across sequences. This variant of the DMM uses Bayesian inference to compute uncertainty over the fragment predictions. The last manuscript proposes an approach to ancestral protein resurrection that overcomes the assumption of factorized evolution and encodes sequence evolution using a tree-structured Ornstein–Uhlenbeck latent process [8].

AB - This thesis covers several topics in structural bioinformatics, molecular evolution, and probabilistic programming. It presents new methods for protein superposition, protein structure prediction, and ancestral sequence resurrection. The first manuscript addresses protein superposition by presenting Theseus-PP [1]. This method uses a Bayesian approach instead of the maximum likelihood method implemented in the original Theseus [2], which allows relevant priors to be placed over the model’s parameters. The superposition model is regarded as a new type of error loss function that can assist protein structure inference. The second manuscript extends Theseus-PP into Theseus-HMC [3]. This method uses Hamiltonian Monte Carlo inference, specifically the No-U-Turn Sampler [4], to compute uncertainty over the parameters of the superposition problem. The third manuscript adapts the generative Deep Markov Model (DMM) [5] to the prediction of protein fragment libraries [6]. Deep Markov Models extend classical Hidden Markov Models by using both amortized inference and gated neural networks (such as recurrent neural networks [7]) over the emission and transition probabilities to preserve long-range dependencies across sequences. This variant of the DMM uses Bayesian inference to compute uncertainty over the fragment predictions. The last manuscript proposes an approach to ancestral protein resurrection that overcomes the assumption of factorized evolution and encodes sequence evolution using a tree-structured Ornstein–Uhlenbeck latent process [8].

M3 - Ph.D. thesis

BT - Deep Probabilistic Programming Applied Protein Superposition

PB - Department of Computer Science, Faculty of Science, University of Copenhagen

ER -

ID: 310388138