Semi-automatic tool to ease the creation and optimization of GPU programs
Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
Standard
Semi-automatic tool to ease the creation and optimization of GPU programs. / Jepsen, Jacob.
Proceedings of the 43rd International Conference on Parallel Processing Workshops: ICPPW 2014. IEEE, 2014. p. 196-205.
RIS
TY - GEN
T1 - Semi-automatic tool to ease the creation and optimization of GPU programs
AU - Jepsen, Jacob
PY - 2014
Y1 - 2014
N2 - We present a tool that reduces the development time of GPU-executable code. We implement a catalogue of common optimizations specific to the GPU architecture. Through the tool, the programmer can semi-automatically transform a computationally-intensive code section into GPU-executable form and apply optimizations thereto. Based on experiments, the code generated by the tool can be 3-256X faster than code generated by an OpenACC compiler, 4-37X faster than optimized CPU code, and attain up to 25% of peak performance of the GPU. We found that by using pattern-matching rules, many of the transformations can be performed automatically, which makes the tool usable for both novices and experts in GPU programming.
U2 - 10.1109/ICPPW.2014.36
DO - 10.1109/ICPPW.2014.36
M3 - Article in proceedings
SN - 978-1-4799-5615-9
SP - 196
EP - 205
BT - Proceedings of the 43rd International Conference on Parallel Processing Workshops
PB - IEEE
T2 - 43rd International Conference on Parallel Processing Workshops, ICPPW 2014
Y2 - 9 September 2014 through 12 September 2014
ER -