Design and GPGPU performance of Futhark's redomap construct

Publication: Contribution to book/anthology/report, Conference contribution in proceedings, Research, peer-reviewed

Standard

Design and GPGPU performance of Futhark's redomap construct. / Henriksen, Troels; Larsen, Ken Friis; Oancea, Cosmin Eugen.

Proceedings of the 3rd ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming. Association for Computing Machinery, 2016. pp. 17-24.

Harvard

Henriksen, T, Larsen, KF & Oancea, CE 2016, Design and GPGPU performance of Futhark's redomap construct. in Proceedings of the 3rd ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming. Association for Computing Machinery, pp. 17-24, 3rd ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming, Santa Barbara, USA, 14/06/2016. https://doi.org/10.1145/2935323.2935326

APA

Henriksen, T., Larsen, K. F., & Oancea, C. E. (2016). Design and GPGPU performance of Futhark's redomap construct. In Proceedings of the 3rd ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming (pp. 17-24). Association for Computing Machinery. https://doi.org/10.1145/2935323.2935326

Vancouver

Henriksen T, Larsen KF, Oancea CE. Design and GPGPU performance of Futhark's redomap construct. In Proceedings of the 3rd ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming. Association for Computing Machinery. 2016. pp. 17-24. https://doi.org/10.1145/2935323.2935326

Author

Henriksen, Troels ; Larsen, Ken Friis ; Oancea, Cosmin Eugen. / Design and GPGPU performance of Futhark's redomap construct. Proceedings of the 3rd ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming. Association for Computing Machinery, 2016. pp. 17-24

Bibtex

@inproceedings{4e16c29ba93247b990f149a346353c3f,
title = "Design and GPGPU performance of Futhark's redomap construct",
abstract = "This paper presents and evaluates a novel second-order operator, named 'redomap', that stems from 'map'-'reduce' compositions in the context of the purely-functional array language Futhark, which is aimed at efficient GPGPU execution. Main contributions are: First, we demonstrate an aggressive fusion technique that is centered on the 'redomap' operator. Second, we present a compilation technique for 'redomap' that efficiently sequentializes the excess parallelism and ensures coalesced access to global memory, even for non-commutative 'reduce' operators. Third, a detailed performance evaluation shows that Futhark's automatically generated code matches or exceeds performance of hand-tuned Thrust code. Our evaluation infrastructure is publicly available and we encourage replication and verification of our results. ",
author = "Henriksen, Troels and Larsen, {Ken Friis} and Oancea, {Cosmin Eugen}",
year = "2016",
doi = "10.1145/2935323.2935326",
language = "English",
pages = "17--24",
booktitle = "Proceedings of the 3rd ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming",
publisher = "Association for Computing Machinery",
note = "3rd ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming, ARRAY 2016; Conference date: 14-06-2016"
}

RIS

TY - CONF

T1 - Design and GPGPU performance of Futhark's redomap construct

AU - Henriksen, Troels

AU - Larsen, Ken Friis

AU - Oancea, Cosmin Eugen

N1 - Conference code: 3

PY - 2016

Y1 - 2016

AB - This paper presents and evaluates a novel second-order operator, named 'redomap', that stems from 'map'-'reduce' compositions in the context of the purely-functional array language Futhark, which is aimed at efficient GPGPU execution. Main contributions are: First, we demonstrate an aggressive fusion technique that is centered on the 'redomap' operator. Second, we present a compilation technique for 'redomap' that efficiently sequentializes the excess parallelism and ensures coalesced access to global memory, even for non-commutative 'reduce' operators. Third, a detailed performance evaluation shows that Futhark's automatically generated code matches or exceeds performance of hand-tuned Thrust code. Our evaluation infrastructure is publicly available and we encourage replication and verification of our results.

U2 - 10.1145/2935323.2935326

DO - 10.1145/2935323.2935326

M3 - Article in proceedings

SP - 17

EP - 24

BT - Proceedings of the 3rd ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming

PB - Association for Computing Machinery

T2 - 3rd ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming

Y2 - 14 June 2016 through 14 June 2016

ER -

ID: 164443159