Exploiting functional invariants to optimise parallelism: A dataflow approach – Københavns Universitet

Exploiting functional invariants to optimise parallelism: A dataflow approach

Abstract:

We present L0, a purely functional programing language supporting nested
regular data parallelism and targeting massively parallel SIMD hardware
such as modern graphics processing units (GPUs).

L0 incorporates the following novel features:

  • A type system for in-place modification and aliasing of arrays and
    array slices that ensures referential transparency, which in turn
    supports equational reasoning.
  • An assertion language for expressing bounds checks on dynamically
    allocated arrays, which can often be checked statically to eliminate
    dynamic bounds checks.
  • Compiler optimizations for hoisting bounds checks out of inner loops
    and performing loop fusion based on structural transformations.

We show that:

  • The type system is simpler than existing linear and unique typing
    systems such Clean [Barendsen 1996], and more expressive than
    libraries such as Repa and Accelerate [Keller 2010, Chakravarty
    2011], for efficient array processing.
  • Our fusion transformation is capable of fusing loops whose output is
    used in multiple places, when possible without duplicating
    computation, a feature not found in other implementations of fusion
    [Jones 2001].
  • The effectiveness of our optimizations is demonstrated on three
    real-world benchmark problems from quantitative finance, based on
    empirical run-time measurements and manual inspection of the
    optimised programs. In particular, hoisting and fusion yield a
    sequential speed-up of up to 71% compared to the unoptimized source
    code, even without any parallel execution.

The results suggest that the language design, expressiveness and
optimization techniques of L0 can be realized across a range of
SIMD-architectures, notably existing GPUs and manycore-chips, to
simultaneously achieve portability and eventually performance
competitive with hand-coding.

The results reported are based on joint work with Cosmin Oancea, DIKU.

Advisors: Cosmin Oancea, Fritz Henglein (DIKU)
External evaluator: Peter Sestoft (ITU)

Both presentation (13:00-13:30) and oral examination (13:30-14:00) are public. All are welcome.