A sequence-profile-based HMM for predicting and discriminating β barrel membrane proteins

Publication: Contribution to journal › Journal article › Research › peer-reviewed

Motivation: Membrane proteins are an abundant and functionally relevant subset of proteins, estimated to make up roughly 15 to 30% of the proteome of fully sequenced organisms. These estimates are mainly computed on the basis of sequence comparison and membrane protein prediction. It is therefore urgent to develop methods capable of selecting membrane proteins, especially outer membrane proteins, which are barely taken into consideration when proteome-wide analysis is performed. Such methods will also help protein annotation when no homologous sequence is found in the database. Outer membrane proteins solved so far at atomic resolution interact with the external membrane of bacteria through a characteristic β barrel structure comprising different even numbers of β strands (β barrel membrane proteins). In this they differ from the proteins of the cytoplasmic membrane, which are endowed with alpha-helix bundles (all alpha membrane proteins), and they need specialised predictors.

Results: We develop an HMM that predicts the topology of β barrel membrane proteins using evolutionary information as input. The model is cyclic with 6 types of states: two for the β strand transmembrane core, one for the β strand cap on either side of the membrane, one for the inner loop, one for the outer loop and one for the globular domain state in the middle of each loop. The development of a specific input for HMMs based on multiple sequence alignment is novel. The accuracy per residue of the model is 83% when a jackknife procedure is adopted. With a model optimisation method based on a dynamic programming algorithm, the topology of seven of the twelve proteins in the testing set is also correctly predicted. When used as a discriminator, the model is rather selective. At a fixed probability value, it retains 84% of a non-redundant set comprising 145 sequences of well-annotated outer membrane proteins. Concomitantly, it correctly rejects 90% of a set of globular proteins including about 1200 chains with low sequence identity (<30%) and 90% of a set of all alpha membrane proteins including 188 chains.

Availability: The program will be available on request from the authors.
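
The cyclic topology and the dynamic-programming decoding mentioned in the Results can be illustrated with a toy sketch. This is not the authors' program: the four states (which collapse the six state types into one state per segment), the two-letter alphabet, every probability value, and the choice to score a profile column as a dot product with a state's emission vector are all assumptions made for illustration only.

```python
import math

# Toy alphabet (assumption): H = hydrophobic, P = polar. A real profile has 20 columns.
ALPHABET = ("H", "P")

# Hypothetical states forming one cycle:
# inner loop -> strand (outward) -> outer loop -> strand (inward) -> inner loop.
STATES = ("inner", "up", "outer", "down")
LABEL = {"inner": "i", "up": "T", "outer": "o", "down": "T"}

# Illustrative transition probabilities: stay in the segment or advance around the cycle.
TRANS = {
    "inner": {"inner": 0.7, "up": 0.3},
    "up": {"up": 0.7, "outer": 0.3},
    "outer": {"outer": 0.7, "down": 0.3},
    "down": {"down": 0.7, "inner": 0.3},
}

# Illustrative emission vectors over ALPHABET (strands prefer H, loops prefer P).
EMIT = {
    "inner": (0.2, 0.8),
    "up": (0.8, 0.2),
    "outer": (0.2, 0.8),
    "down": (0.8, 0.2),
}

START = {"inner": 1.0}  # assumption: the chain starts on the inner side


def safe_log(x):
    """Natural log that maps zero probability to minus infinity."""
    return math.log(x) if x > 0.0 else float("-inf")


def emission(state, column):
    """Score a profile column as its dot product with the state's emission vector."""
    return sum(e * f for e, f in zip(EMIT[state], column))


def viterbi(profile):
    """Return the label string of the most probable state path for a list of profile columns."""
    score = {s: safe_log(START.get(s, 0.0)) + safe_log(emission(s, profile[0])) for s in STATES}
    back = []
    for column in profile[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            # Best predecessor of s among states with an allowed transition into s.
            best_prev = max(STATES, key=lambda p: score[p] + safe_log(TRANS[p].get(s, 0.0)))
            best = score[best_prev] + safe_log(TRANS[best_prev].get(s, 0.0))
            new_score[s] = best + safe_log(emission(s, column))
            pointers[s] = best_prev
        back.append(pointers)
        score = new_score
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return "".join(LABEL[s] for s in reversed(path))


if __name__ == "__main__":
    polar, hydrophobic = (0.1, 0.9), (0.9, 0.1)  # made-up profile columns (H, P frequencies)
    toy_profile = [polar] * 3 + [hydrophobic] * 4 + [polar] * 3 + [hydrophobic] * 4 + [polar] * 3
    print(viterbi(toy_profile))
```

Because each state can only stay or advance around the cycle, any decoded path alternates inner loop, strand, outer loop, strand, which is how a cyclic model enforces a consistent topology. A realistic model would use many states per segment to constrain strand and loop lengths, and parameters trained on known structures.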

Original language: English
Journal: Bioinformatics
Volume: 18
Issue number: Suppl. 1
Pages (from-to): S46-S53
ISSN: 1367-4803
DOI
Status: Published - 2002
Externally published: Yes

ID: 199873431