Integrative dynamic reconfiguration in a parallel stream processing engine

Publication: Contribution to book/anthology/report; Conference article in proceedings; Research; peer-reviewed

Standard

Integrative dynamic reconfiguration in a parallel stream processing engine. / Madsen, Kasper Grud Skat; Zhou, Yongluan; Cao, Jianneng.

Proceedings of the 33rd IEEE International Conference on Data Engineering (ICDE). IEEE Press, 2017. pp. 227-230.

Harvard

Madsen, KGS, Zhou, Y & Cao, J 2017, Integrative dynamic reconfiguration in a parallel stream processing engine. in Proceedings of the 33rd IEEE International Conference on Data Engineering (ICDE). IEEE Press, pp. 227-230, 33rd IEEE International Conference on Data Engineering, San Diego, California, USA, 19/04/2017. https://doi.org/10.1109/ICDE.2017.81

APA

Madsen, K. G. S., Zhou, Y., & Cao, J. (2017). Integrative dynamic reconfiguration in a parallel stream processing engine. In Proceedings of the 33rd IEEE International Conference on Data Engineering (ICDE) (pp. 227-230). IEEE Press. https://doi.org/10.1109/ICDE.2017.81

Vancouver

Madsen KGS, Zhou Y, Cao J. Integrative dynamic reconfiguration in a parallel stream processing engine. In Proceedings of the 33rd IEEE International Conference on Data Engineering (ICDE). IEEE Press. 2017. p. 227-230. https://doi.org/10.1109/ICDE.2017.81

Author

Madsen, Kasper Grud Skat ; Zhou, Yongluan ; Cao, Jianneng. / Integrative dynamic reconfiguration in a parallel stream processing engine. Proceedings of the 33rd IEEE International Conference on Data Engineering (ICDE). IEEE Press, 2017. pp. 227-230

Bibtex

@inproceedings{b1be9b253d044ab7a4f2c311e551bb60,
title = "Integrative dynamic reconfiguration in a parallel stream processing engine",
abstract = "Load balancing, operator instance collocations and horizontal scaling are critical issues in Parallel Stream Processing Engines to achieve low data processing latency, optimized cluster utilization and minimized communication cost, respectively. In previous work, these issues are typically tackled separately and independently. We argue that these problems are tightly coupled in the sense that they all need to determine the allocations of workloads and migrate computational states at runtime. Optimizing them independently would result in suboptimal solutions. Therefore, in this paper, we investigate how these three issues can be modeled as one integrated optimization problem. In particular, we first consider jobs where workload allocations have little effect on the communication cost, and model the problem of load balance as a Mixed-Integer Linear Program. Afterwards, we present an extended solution called ALBIC, which supports general jobs. We implement the proposed techniques on top of Apache Storm, an open-source Parallel Stream Processing Engine. The extensive experimental results over both synthetic and real datasets show that our techniques clearly outperform existing approaches.",
author = "Madsen, {Kasper Grud Skat} and Yongluan Zhou and Jianneng Cao",
year = "2017",
doi = "10.1109/ICDE.2017.81",
language = "English",
isbn = "978-1-5090-6544-8",
pages = "227--230",
booktitle = "Proceedings of the 33rd IEEE International Conference on Data Engineering (ICDE)",
publisher = "IEEE Press",
note = "33rd IEEE International Conference on Data Engineering, ICDE 2017 ; Conference date: 19-04-2017 Through 22-04-2017",

}

RIS

TY - GEN

T1 - Integrative dynamic reconfiguration in a parallel stream processing engine

AU - Madsen, Kasper Grud Skat

AU - Zhou, Yongluan

AU - Cao, Jianneng

N1 - Conference code: 33

PY - 2017

Y1 - 2017

N2 - Load balancing, operator instance collocations and horizontal scaling are critical issues in Parallel Stream Processing Engines to achieve low data processing latency, optimized cluster utilization and minimized communication cost, respectively. In previous work, these issues are typically tackled separately and independently. We argue that these problems are tightly coupled in the sense that they all need to determine the allocations of workloads and migrate computational states at runtime. Optimizing them independently would result in suboptimal solutions. Therefore, in this paper, we investigate how these three issues can be modeled as one integrated optimization problem. In particular, we first consider jobs where workload allocations have little effect on the communication cost, and model the problem of load balance as a Mixed-Integer Linear Program. Afterwards, we present an extended solution called ALBIC, which supports general jobs. We implement the proposed techniques on top of Apache Storm, an open-source Parallel Stream Processing Engine. The extensive experimental results over both synthetic and real datasets show that our techniques clearly outperform existing approaches.

AB - Load balancing, operator instance collocations and horizontal scaling are critical issues in Parallel Stream Processing Engines to achieve low data processing latency, optimized cluster utilization and minimized communication cost, respectively. In previous work, these issues are typically tackled separately and independently. We argue that these problems are tightly coupled in the sense that they all need to determine the allocations of workloads and migrate computational states at runtime. Optimizing them independently would result in suboptimal solutions. Therefore, in this paper, we investigate how these three issues can be modeled as one integrated optimization problem. In particular, we first consider jobs where workload allocations have little effect on the communication cost, and model the problem of load balance as a Mixed-Integer Linear Program. Afterwards, we present an extended solution called ALBIC, which supports general jobs. We implement the proposed techniques on top of Apache Storm, an open-source Parallel Stream Processing Engine. The extensive experimental results over both synthetic and real datasets show that our techniques clearly outperform existing approaches.

U2 - 10.1109/ICDE.2017.81

DO - 10.1109/ICDE.2017.81

M3 - Article in proceedings

SN - 978-1-5090-6544-8

SP - 227

EP - 230

BT - Proceedings of the 33rd IEEE International Conference on Data Engineering (ICDE)

PB - IEEE Press

T2 - 33rd IEEE International Conference on Data Engineering

Y2 - 19 April 2017 through 22 April 2017

ER -

ID: 179278061