PCOMP: Compiler from a high level parallel programming language with nested parallel constructs to DSM systems
SIM: Simulator for the high level parallel programming language (ParC) for DOS and Unix (AIX).
Objectives
The goal of the PCOMP project is to apply compiler technology to speed up the execution of
``high level'' parallel programs on a Distributed Shared Memory (DSM) system
that admits only a ``low level'' programming style. By this we mean that the
compiler optimizes the code of a given program so that the simulation of the shared memory by page
movements of the underlying DSM system becomes more efficient. These optimizations differ from traditional
compiler optimizations for parallel processing, such as automatic parallelization of sequential loops,
translation of shared memory references to message passing (Shasta, HPF), and partitioning of shared
arrays. In particular, these optimizations may require changes in the underlying DSM system.
We use the ParC language as the source
language; consequently, the compiler has to support arbitrary nesting of parallel loops and parallel
recursive function calls.
The current version of PCOMP supports the following features:
-
Local locking of memory pages needed for the execution of a heavy loop, so that these pages
will not easily migrate to a remote machine during the execution of that loop. In this way we reduce the
overhead of overly frequent page movements, and force the remote machine to execute
another activity of the program which does not use the locked pages.
Consequently, the DSM system includes support for page locking.
-
Explicit insertion of context switch instructions in while-loops so that implicit synchronization
(using shared variables) is supported with minimal overhead.
-
Special envelope functions used to spawn new ParC activities. Here we attempt
to map most of the activities generated by a ParC program to local function calls or loop iterations.
In this way the overhead of dynamically creating the operating system threads
that execute ParC's activities is minimized.
The main difficulty of such a mapping is to avoid
allocating a ParC activity that ``releases'' a while-loop of another activity
to the same operating system thread that executes the ``while-loop'' activity.
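To make the page locking feature concrete, here is a minimal sketch in C of a heavy loop bracketed by lock and unlock calls. The names dsm_lock_range, dsm_unlock_range and dsm_may_migrate, the toy page table, and the 4096-byte page size are all assumptions made for illustration; they are not the actual PCOMP or DSM kernel interface.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096   /* assumed page size */
#define MAX_PAGES 64     /* size of the toy page table */

/* Toy page table: lock_count[p] > 0 means page p must not migrate. */
static int lock_count[MAX_PAGES];

/* Hypothetical DSM call: pin pages first_page..last_page locally. */
static void dsm_lock_range(size_t first_page, size_t last_page) {
    assert(first_page <= last_page && last_page < MAX_PAGES);
    for (size_t p = first_page; p <= last_page; p++) lock_count[p]++;
}

/* Hypothetical DSM call: release pages pinned by dsm_lock_range. */
static void dsm_unlock_range(size_t first_page, size_t last_page) {
    assert(first_page <= last_page && last_page < MAX_PAGES);
    for (size_t p = first_page; p <= last_page; p++) {
        assert(lock_count[p] > 0);
        lock_count[p]--;
    }
}

/* Would a remote request for page p be granted right now? */
static int dsm_may_migrate(size_t p) {
    assert(p < MAX_PAGES);
    return lock_count[p] == 0;
}

/* A heavy loop in the shape the compiler might emit:
 * lock the pages the loop touches, compute, then unlock. */
static long sum_with_locked_pages(const int *a, size_t n, size_t base_page) {
    assert(n > 0);
    size_t last_page = base_page + (n * sizeof a[0] - 1) / PAGE_SIZE;
    long sum = 0;
    dsm_lock_range(base_page, last_page);
    for (size_t i = 0; i < n; i++) sum += a[i];
    dsm_unlock_range(base_page, last_page);
    return sum;
}
```

While the loop runs, a request from a remote machine for one of the locked pages would be refused in this model, so that machine would have to pick up another activity instead.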
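The context switch insertion, and the difficulty with activities that share an operating system thread, can be sketched with a toy cooperative scheduler in C. The primitive parc_yield, the single pending slot, and the function names are hypothetical and only model the idea; the real compiler output and scheduler are not shown on this page.

```c
#include <assert.h>
#include <stddef.h>

/* Shared variable used for implicit synchronization. */
static volatile int ready = 0;

/* Toy cooperative scheduler: at most one pending activity,
 * which only gets to run when the current activity yields. */
typedef void (*activity_fn)(void);
static activity_fn pending = NULL;
static int yield_count = 0;

/* Hypothetical context switch primitive inserted by the compiler. */
static void parc_yield(void) {
    yield_count++;
    if (pending != NULL) {
        activity_fn next = pending;
        pending = NULL;
        next();               /* let the other activity make progress */
    }
}

/* The activity that ``releases'' the waiting while-loop. */
static void releaser(void) {
    ready = 1;
}

/* Source form:    while (!ready) ;
 * Compiled form:  the same loop with a yield in its body. */
static void wait_until_ready(void) {
    while (!ready) {
        parc_yield();
    }
}

/* Small driver: returns how many yields the wait needed. */
static int wait_demo(void) {
    ready = 0;
    yield_count = 0;
    pending = releaser;
    wait_until_ready();
    assert(ready == 1);
    return yield_count;
}
```

In this model the releasing activity shares a thread with the waiting loop, so without the inserted yield the loop would spin forever. That is the situation the envelope functions have to avoid when they map activities to local calls.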
Current State
We have a running prototype of the compiler and the underlying DSM kernel for
a cluster of NT machines. The underlying DSM kernel supports Sequential Consistency
and multiple copies of read-only pages.
The software has not been fully tested and bugs are probably crawling all over;
however, it worked for simple programs of 50-200 lines, showing nice speedups.
A new version, including (A) an improved DSM layer with broadcasts and a weak consistency memory model
and (B) some support for automatic parallelization in the PCOMP compiler,
is expected to be available around 1.7.99.
The new version will contain more forms of DSM optimizations, in which the compiler
inserts explicit instructions to send memory pages, hoping to save the overhead
of page-fault interrupts.
Hopefully, this version will contain fewer bugs than the current one.
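The planned explicit page sends can be illustrated with a small model in C that counts page faults with and without sends issued ahead of use. Everything here is an assumption for illustration: the single-owner page table, the names dsm_access and dsm_send_page, and the idea that a send costs no fault on the receiving side.

```c
#include <assert.h>

#define NPAGES 8

static int owner[NPAGES];   /* node currently holding each page */
static int faults = 0;      /* page faults taken on access */
static int sends = 0;       /* explicit page sends performed */

static void dsm_reset(void) {
    for (int p = 0; p < NPAGES; p++) owner[p] = 0;
    faults = 0;
    sends = 0;
}

/* Access by `node`: a page held elsewhere costs a fault and a transfer. */
static void dsm_access(int node, int page) {
    assert(page >= 0 && page < NPAGES);
    if (owner[page] != node) {
        faults++;
        owner[page] = node;
    }
}

/* Hypothetical compiler-inserted call: push a page before it is needed. */
static void dsm_send_page(int page, int to_node) {
    assert(page >= 0 && page < NPAGES);
    if (owner[page] != to_node) {
        sends++;
        owner[page] = to_node;
    }
}

/* Node 1 touches pages 0..n-1; if presend is set, the pages are sent
 * to node 1 first. Returns the number of faults node 1 takes. */
static int faults_for_run(int n, int presend) {
    assert(n >= 0 && n <= NPAGES);
    dsm_reset();
    if (presend) {
        for (int p = 0; p < n; p++) dsm_send_page(p, 1);
    }
    for (int p = 0; p < n; p++) dsm_access(1, p);
    return faults;
}
```

The page transfers still happen in both runs; what the model treats as saved is only the fault handling on the receiving node.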
In addition to the PCOMP compiler and its NT DSM, this site offers a simulator
for ParC programs in two versions:
-
A student version for DOS, which can be used to practice parallel programming.
-
A Unix version (AIX and SUN) which counts how many of the shared memory references
made by a ParC program are guaranteed to be executed locally, and how many might cause
a remote page transfer. This is based on the fact that ParC's semantics distinguishes between
local memory references and external memory references.
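The kind of counting done by the Unix version can be sketched in C as a per-activity tally. The classification rule used here (an address inside the activity's own address range counts as local) is only an assumption standing in for ParC's actual rule, and the type and function names are hypothetical.

```c
#include <assert.h>

/* Per-activity counters. Addresses are simulated as plain integers. */
typedef struct {
    unsigned long lo, hi;   /* assumed local range: [lo, hi) */
    long local_refs;        /* references guaranteed to be local */
    long external_refs;     /* references that might go remote */
} activity_t;

static void activity_init(activity_t *a, unsigned long lo, unsigned long hi) {
    assert(lo <= hi);
    a->lo = lo;
    a->hi = hi;
    a->local_refs = 0;
    a->external_refs = 0;
}

/* Record one memory reference made by activity `a`. */
static void record_ref(activity_t *a, unsigned long addr) {
    if (addr >= a->lo && addr < a->hi) {
        a->local_refs++;
    } else {
        a->external_refs++;
    }
}

/* Fraction of references guaranteed to be local (1.0 if none recorded). */
static double local_fraction(const activity_t *a) {
    long total = a->local_refs + a->external_refs;
    return total == 0 ? 1.0 : (double)a->local_refs / (double)total;
}
```

A high local fraction suggests that few references could trigger a remote page transfer under a DSM system.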
Email:
yosi@cs.haifa.ac.il
Last Updated: Wed Mar 17 10:14:24 IST 1999