IDDomainSpotter: Compositional bias reveals domains in long disordered protein regions—Insights from transcription factors

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

IDDomainSpotter: Compositional bias reveals domains in long disordered protein regions—Insights from transcription factors. / Millard, Peter S.; Bugge, Katrine; Marabini, Riccardo; Boomsma, Wouter; Burow, Meike; Kragelund, Birthe B.

In: Protein Science, Vol. 29, No. 1, 2020, p. 169-183.

Harvard

Millard, PS, Bugge, K, Marabini, R, Boomsma, W, Burow, M & Kragelund, BB 2020, 'IDDomainSpotter: Compositional bias reveals domains in long disordered protein regions—Insights from transcription factors', Protein Science, vol. 29, no. 1, pp. 169-183. https://doi.org/10.1002/pro.3754

APA

Millard, P. S., Bugge, K., Marabini, R., Boomsma, W., Burow, M., & Kragelund, B. B. (2020). IDDomainSpotter: Compositional bias reveals domains in long disordered protein regions—Insights from transcription factors. Protein Science, 29(1), 169-183. https://doi.org/10.1002/pro.3754

Vancouver

Millard PS, Bugge K, Marabini R, Boomsma W, Burow M, Kragelund BB. IDDomainSpotter: Compositional bias reveals domains in long disordered protein regions—Insights from transcription factors. Protein Science. 2020;29(1):169-183. https://doi.org/10.1002/pro.3754

Author

Millard, Peter S.; Bugge, Katrine; Marabini, Riccardo; Boomsma, Wouter; Burow, Meike; Kragelund, Birthe B. / IDDomainSpotter: Compositional bias reveals domains in long disordered protein regions—Insights from transcription factors. In: Protein Science. 2020; Vol. 29, No. 1. pp. 169-183.

Bibtex

@article{a67db6b722a44795968a5e6e32d68b21,
title = "IDDomainSpotter: Compositional bias reveals domains in long disordered protein regions—Insights from transcription factors",
abstract = "Protein domains constitute regions of distinct structural properties and molecular functions that are retained when removed from the rest of the protein. However, due to the lack of tertiary structure, the identification of domains has been largely neglected for long (>50 residues) intrinsically disordered regions. Here we present a sequence-based approach to assess and visualize domain organization in long intrinsically disordered regions based on compositional sequence biases. An online tool to find putative intrinsically disordered domains (IDDomainSpotter) in any protein sequence or sequence alignment using any particular sequence trait is available at https://www.bio.ku.dk/sbinlab/IDDomainSpotter. Using this tool, we have identified a putative domain enriched in hydrophilic and disorder-promoting residues (Pro, Ser, and Thr) and depleted in positive charges (Arg and Lys) bordering the folded DNA-binding domains of several transcription factors (p53, GCR, NAC46, MYB28, and MYB29). This domain, from two different MYB transcription factors, was characterized biophysically to determine its properties. Our analyses show the domain to be extended, dynamic and highly disordered. It connects the DNA-binding domain to other disordered domains and is present and conserved in several transcription factors from different families and domains of life. This example illustrates the potential of IDDomainSpotter to predict, from sequence alone, putative domains of functional interest in otherwise uncharacterized disordered proteins.",
keywords = "compositional bias, DNA-binding domain, domain, IDDomainSpotter, IDPs, low-complexity regions, NMR, p53, plant MYB protein, transactivation domain, transcription factor",
author = "Millard, {Peter S.} and Katrine Bugge and Riccardo Marabini and Wouter Boomsma and Meike Burow and Kragelund, {Birthe B.}",
note = "Special Issue: Tools for Protein Science",
year = "2020",
doi = "10.1002/pro.3754",
language = "English",
volume = "29",
pages = "169--183",
journal = "Protein Science",
issn = "0961-8368",
publisher = "Wiley-Blackwell",
number = "1",

}

RIS

TY - JOUR

T1 - IDDomainSpotter

T2 - Compositional bias reveals domains in long disordered protein regions—Insights from transcription factors

AU - Millard, Peter S.

AU - Bugge, Katrine

AU - Marabini, Riccardo

AU - Boomsma, Wouter

AU - Burow, Meike

AU - Kragelund, Birthe B.

N1 - Special Issue: Tools for Protein Science

PY - 2020

Y1 - 2020

AB - Protein domains constitute regions of distinct structural properties and molecular functions that are retained when removed from the rest of the protein. However, due to the lack of tertiary structure, the identification of domains has been largely neglected for long (>50 residues) intrinsically disordered regions. Here we present a sequence-based approach to assess and visualize domain organization in long intrinsically disordered regions based on compositional sequence biases. An online tool to find putative intrinsically disordered domains (IDDomainSpotter) in any protein sequence or sequence alignment using any particular sequence trait is available at https://www.bio.ku.dk/sbinlab/IDDomainSpotter. Using this tool, we have identified a putative domain enriched in hydrophilic and disorder-promoting residues (Pro, Ser, and Thr) and depleted in positive charges (Arg and Lys) bordering the folded DNA-binding domains of several transcription factors (p53, GCR, NAC46, MYB28, and MYB29). This domain, from two different MYB transcription factors, was characterized biophysically to determine its properties. Our analyses show the domain to be extended, dynamic and highly disordered. It connects the DNA-binding domain to other disordered domains and is present and conserved in several transcription factors from different families and domains of life. This example illustrates the potential of IDDomainSpotter to predict, from sequence alone, putative domains of functional interest in otherwise uncharacterized disordered proteins.

KW - compositional bias

KW - DNA-binding domain

KW - domain

KW - IDDomainSpotter

KW - IDPs

KW - low-complexity regions

KW - NMR

KW - p53

KW - plant MYB protein

KW - transactivation domain

KW - transcription factor

U2 - 10.1002/pro.3754

DO - 10.1002/pro.3754

M3 - Journal article

C2 - 31642121

AN - SCOPUS:85075199244

VL - 29

SP - 169

EP - 183

JO - Protein Science

JF - Protein Science

SN - 0961-8368

IS - 1

ER -

ID: 234448370