The PRAM Programming Language Fork

A Programming Environment for Practical Experimentation with the Theory of Parallel Computing



New: PRAM simulator pramsim and system tools now available for Linux!


Contents:


Page last updated 19 Apr 2010 by Christoph W. Kessler

PRAM model and SB-PRAM hardware emulation

The PRAM (Parallel Random Access Machine) is a strictly synchronous MIMD multiprocessor with a sequentially consistent shared memory that can be accessed in unit time. Hence, it completely abstracts from data locality, memory consistency, and communication cost and focuses on pure parallelism instead. Because of its simplicity, it is a very popular model of parallel computation in the theory of parallel algorithms.
Although criticized for being unrealistic, a PRAM has actually been realized in hardware in the 1990s by the SB-PRAM project of Wolfgang Paul's group at the University of Saarbrücken. Drawing upon hardware design techniques such as massive multithreading with cycle-by-cycle interleaving and a pipelined, combining interconnection network between processors and memory modules, it is a physical realization of a Combining Concurrent Read, Concurrent Write PRAM, the strongest PRAM model known in theory. The largest operational SB-PRAM prototype (finished 2001) has 2048 (virtual) processors (corresponding to 64 processor boards). The architecture is scalable.

Today, superscalar, VLIW and other monolithic processor architectures are hitting the limits of their performance spectrum, and leakage currents, heat dissipation and energy consumption problems put another limit on the maximum clock frequency. As a consequence, multithreaded chip multiprocessors are becoming more and more mainstream architecture. In order to obtain speed-up on such an architecture, applications must be parallelized - not only at the instruction level, but also at loop and task level, which is a notoriously complex and time-consuming task with today's parallel supercomputers. This complexity is largely caused by the need to adapt to the hardware idiosyncracies of clusters, constellations, and CC-NUMA architectures, and to worry about message passing, prefetching, data distribution, cache behavior, or memory consistency. Hence, a simple parallel programming model (and the PRAM is the simplest one) could be a realistic option for a future general-purpose programming model.
This trend is supported by recent developments in computer architecture, such as simultaneous multithreading, thread-level speculation, optical interconnects, and network-on-chip technology. Here are some more recent projects on how to realize PRAMs in hardware:

The PRAM programming language Fork

Fork is a programming language for the PRAM model; it has been implemented for the SB-PRAM.
Fork is based on ANSI C with extensions for the management of shared and private address subspaces and variables, and for static and dynamic nesting parallelism by processor group splitting constructs. The groups establish the scope of sharing and of synchronous execution. Fork offers full expressibility for many known parallel algorithmic paradigms like data parallelism, semaphore-coordinated asynchronous processes, pipelining and systolic algorithms, parallel task queue, multiprefix, parallel divide-and-conquer, and even message passing.
The language design of Fork has been developed since 1994 by Christoph W. Kessler and Helmut Seidl, starting with an initial version called Fork95 that was partially based on an earlier proposal (FORK) by Hagerup, Schmitt, and Seidl from 1989. Fork95 was extended several times from 1995 to 1999. In 1999, the language design has reached a final state, and from then on, it is just called Fork.

Example programs in Fork

Some selected Fork example programs. Their source codes are part of the Fork compiler package.

Current releases of the software packages

Recent releases are available here for download.
You need to download and install
(1) the SBPRAM simulator pramsim and the system software tools, and
(2) the Fork compiler framework.

Terms of usage:
The Fork implementation is free, but we appreciate that you cite our book Practical PRAM Programming when publishing results that are based on using it.
The distribution comes with complete source code, examples, and documentation. We have extensively used this compiler for several years now, but, as with all complex software, there are some known bugs and probably some unknown bugs left; please report if you have found one.
Anyway, we assume no liability for any damage or problems that may occur when using this software.
For some components that we borrowed from other software packages (lcc 1.9 and some routines from the GNU C library, see comments in the source code) the corresponding terms of usage apply.

Fork compiler releases

SB-PRAM simulator and system software package

In addition to the Fork Compiler distribution (see above), you need the following SB-PRAM system software tools to be able to compile and execute Fork programs.

These tools were developed for Solaris by the former SB-PRAM software team at the University of Saarbrücken during the 1990s, and ported to Linux by Bert Wesarg at FernUniversität in Hagen in 2008.

Hardware platforms and compatibility problems

The Fork compiler fcc and the SB-PRAM system tools run under Solaris (and other Unix-based systems) and Linux (only v10 of the system tools). Simulating large numbers of processors requires sufficient main memory.
Please notify us if you manage to install fcc and/or the system tools on a platform different from these listed above.
For the installation of fcc and the SB-PRAM system tools you need an ANSI C compiler such as the GNU C compiler.

The trv tool for the visualization of execution traces of Fork programs

Graphical trace file visualization of Fork program traces is obtained by adding four tracing routine calls to the program, compiling/linking it with the -T option, running the instrumented version, and submitting the generated trace file to the trv tool (here is the man page of trv in groff and in postscript format).
In the default setup, group split phases are drawn in black, wait phases at barriers are red. Processor working phases are blue/green combinations where processors belonging to the same group have the same color. Delays due to spinning on locks are drawn in yellow. Waiting phases for join entry are shown in black. join working phases are drawn in light green.
Recently, trv has been generalized to allow for fully user-specified graphics output and easy insertion of user-defined trace events. Sample configurations for color and greyscale printers are provided. Fork programs containing MPI calls for message passing result in a processor-time diagram augmented by arrows for messages. This new version of trv is available in the compiler version 2.1.

Examples for trace images:

Publications

Textbook:

Quick Reference Cards:

Slides:

Fork95 Tutorial:

Research Papers:

(as a historical overview. A complete and up-to-date description of all Fork-related issues is given in Practical PRAM Programming)

Projects based on Fork


This page by Christoph W. Kessler (chrke \at ida.liu.se)