Design and GPGPU performance of Futhark's redomap construct

Research output: Chapter in Book/Report/Conference proceeding; Article in proceedings; Research; peer-review

This paper presents and evaluates a novel second-order operator, named 'redomap', that stems from 'map'-'reduce' compositions in the context of the purely functional array language Futhark, which is aimed at efficient GPGPU execution. The main contributions are as follows. First, we demonstrate an aggressive fusion technique that is centered on the 'redomap' operator. Second, we present a compilation technique for 'redomap' that efficiently sequentializes the excess parallelism and ensures coalesced access to global memory, even for non-commutative 'reduce' operators. Third, a detailed performance evaluation shows that Futhark's automatically generated code matches or exceeds the performance of hand-tuned Thrust code. Our evaluation infrastructure is publicly available, and we encourage replication and verification of our results.
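The idea of fusing a 'map' into a 'reduce' and sequentializing excess parallelism can be illustrated with a minimal sketch. It is written in Python for readability and is not Futhark's compiler output or the paper's algorithm: the names `map_reduce` and `redomap_sketch` and the `num_chunks` parameter are hypothetical. The sketch assumes only that the operator is associative with a neutral element; because partial results are combined in their original order, commutativity is not required.

```python
from functools import reduce

def map_reduce(op, f, ne, xs):
    """Unfused reference: materialize map(f, xs), then reduce it with op."""
    return reduce(op, [f(x) for x in xs], ne)

def redomap_sketch(op, f, ne, xs, num_chunks=4):
    """Hypothetical fused form: no intermediate mapped array is built.

    The input is split into contiguous chunks. Each chunk stands in for the
    work of one thread and is folded sequentially, applying f on the fly.
    """
    n = len(xs)
    chunk = max(1, -(-n // num_chunks))  # ceiling division, at least 1
    partials = []
    for start in range(0, n, chunk):
        acc = ne
        for x in xs[start:start + chunk]:
            acc = op(acc, f(x))  # map and reduce fused in one pass
        partials.append(acc)
    # Combine per-chunk results in their original order, so an associative
    # but non-commutative operator still gives the same answer.
    return reduce(op, partials, ne)

# String concatenation is associative but not commutative.
words = ["fu", "th", "ark", "!", "?"]
concat = lambda a, b: a + b
fused = redomap_sketch(concat, str.upper, "", words, num_chunks=2)
unfused = map_reduce(concat, str.upper, "", words)
```

Here `fused` and `unfused` hold the same string. On a GPU the per-chunk folds would run in parallel and the memory access pattern would matter; this sequential sketch only shows why the chunked, fused evaluation preserves the result.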
Original language: English
Title of host publication: Proceedings of the 3rd ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming
Number of pages: 8
Publisher: Association for Computing Machinery
Publication date: 2016
Pages: 17-24
ISBN (Electronic): 978-1-4503-4384-8
DOIs
Publication status: Published - 2016
Event: 3rd ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming - Santa Barbara, United States
Duration: 14 Jun 2016 to 14 Jun 2016
Conference number: 3

Conference

Conference: 3rd ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming
Number: 3
Country: United States
City: Santa Barbara
Period: 14/06/2016 to 14/06/2016

ID: 164443159