Distributed Numerical Python

Midway PhD exam by Mads Ruben Burgdorff Kristensen


As computers have evolved, computer simulations have become a useful part of the natural sciences. Computer simulations make it possible to conduct experiments that would otherwise be impractical, e.g. simulating an earthquake or a nuclear detonation. A computer simulation may become such an extensive computational task that a network of processors is needed for it to finish in reasonable time. Developing such simulations requires deep expertise in the relevant scientific field. The development and implementation is therefore often done not by computer scientists, who are experts in HPC, but by natural scientists working in that particular field. The Holy Grail for many scientific frameworks is therefore to ease programming, increase productivity and support efficient parallelization.

The development of numerical simulations often consists of two implementations: a prototype and a final version. The algorithm is developed and implemented in a prototype, with which the correctness of the algorithm can be verified. Typically, many iterations of development are required to obtain a correct prototype, so a high-productivity language such as MATLAB is used for this purpose. However, once the algorithm is correct, the performance of the implementation becomes essential for doing research with it. This performance requirement presents a problem for the researcher, since highly optimized code requires a fairly low-level programming language such as C/C++ or Fortran. The final version will therefore typically be a reimplementation of the prototype, which involves both changing the programming language and parallelizing the implementation.

The overall target of my work is to provide a tool that meets two needs: high productivity, so that researchers can move from idea to prototype in a short time, and high performance, so that a costly and risky reimplementation is no longer needed. It should be possible to develop and implement an algorithm using a simple notebook and then effortlessly execute the implementation on a cluster of computers while utilizing all available CPUs.

My approach to achieving this goal has been twofold: first, to gain experience in optimizing a concrete scientific simulation for a massively parallel architecture, and then to develop a framework that eases the process of implementing such scientific simulations.

Supervisor: Brian Vinter, DIKU

Censor: Christian S Pedersen, DAIMI