MSc Thesis Defence by Theis Hjalte Thorn Jakobsen – University of Copenhagen

Convolutional Neural Network for Predicting Protein Stability by using Local Spherical Representations of Amino Acid Environments

Convolutional neural networks have shown great potential beyond conventional image recognition: Boomsma and Frellsen (2017) showed that such networks can predict amino acid propensities and stability changes in protein structures at the level of state-of-the-art stability prediction models. This success can be attributed to specialized concentric spherical representations of amino acid environments that can be used directly as input to these types of networks. In this paper, I explore how robust such protein stability prediction models are to small variations in protein structure, by predicting stability changes on a series of modelled homologue structures. I begin with a description of convolutional neural networks and the mechanisms that are integral to their success in learning features of spatially embedded data. This is followed by an introduction to two strategies for representing amino acid environments in protein structures, namely a spherical-grid and a cubed-sphere representation, both of which can be used as input to convolutional neural networks. I show that these models are able to learn advanced features from spherical amino acid environments found in solved protein structures, by comparing predicted amino acid propensities with the predictions of other models. I then correlate the protein stabilities predicted on modelled homologue structures with experimentally obtained Gibbs free energy values for the corresponding solved structures, and show that these types of models have no clear resilience towards variations that are perceivably small. Nevertheless, I conclude that these types of models show great promise in the field of protein stability prediction, as they are able to predict stability changes of protein structures at the level of state-of-the-art models without the aid of domain-specific knowledge.

Supervisor: Wouter Boomsma

External examiner: Jes Frellsen