In W. Cheng and A. S. M. Sajeev, editors, Proceedings of 6th Annual Australasian Conference on Parallel And Real-Time Systems (PART '99), Springer-Verlag, 1999.
Research on the high-performance implementation of nested data parallelism has, over time, covered a wide range of architectures, targeting scalar and vector processors as well as shared-memory and distributed-memory machines. We are currently investigating methods to integrate this technology into a single portable compiler back-end. Essential to our approach are two program transformations: flattening, which evens out irregular parallelism, and calculational fusion, which increases locality of reference. We generate C code that makes use of a portable, light-weight, collective-communication library. First experiments on scalar, vector, and distributed-memory machines support the feasibility of the approach.
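As a rough illustration of the two transformations named above, the following minimal Python sketch represents an irregular nested list as a flat data vector plus segment lengths (the idea behind flattening) and combines an elementwise squaring with a per-segment sum into a single pass (the idea behind fusion). It is not the compiler's actual transformation or its generated C code; the function names `flatten` and `segmented_sum_of_squares` and the example data are assumptions made purely for illustration.

```python
def flatten(nested):
    """Split a nested list into a flat data vector plus segment lengths.

    Hypothetical helper: the flat vector can be processed uniformly even
    when the original sublists have very different sizes.
    """
    data = [x for segment in nested for x in segment]
    lengths = [len(segment) for segment in nested]
    return data, lengths


def segmented_sum_of_squares(data, lengths):
    """Per-segment sum of squares over the flat representation.

    The squaring (a map) and the summation (a reduction) are fused into
    one loop, so no intermediate vector of squares is materialised.
    """
    sums = []
    start = 0
    for n in lengths:
        total = 0
        for x in data[start:start + n]:
            total += x * x  # map and reduce in a single pass
        sums.append(total)
        start += n
    return sums


# Irregular nesting: segments of length 3, 0, and 2.
nested = [[1, 2, 3], [], [4, 5]]
data, lengths = flatten(nested)
result = segmented_sum_of_squares(data, lengths)
```

Here `data` holds all elements contiguously and `lengths` records where each original sublist begins and ends, so the work can be divided evenly over the flat vector regardless of how unbalanced the nesting is.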
PostScript version (14 pages).
This page is part of Manuel Chakravarty's WWW-stuff.