Why a system that parallelizes recursive programs automatically?
Writing efficient parallel programs is still a difficult task, even after decades of research on the topic. For data-parallel programs with regular data access patterns, automatic parallelization has made substantial progress, but irregular programs remain a challenge. An interesting subclass of irregular programs consists of those whose parallelism is implicit in multiple independent recursive calls started from a loop or statement sequence. Such procedures, often found in computations on graphs or trees, form the core of a number of real-life applications, e.g., the Barnes-Hut galaxy simulation or algorithms used in computer graphics.
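To make the program class concrete, here is a minimal, hypothetical example of such a procedure: a tree computation whose two recursive calls are mutually independent and could therefore execute in parallel. The data structure and function names are illustrative, not taken from any particular application.

```c
#include <stddef.h>

/* A binary tree node; the shape of the tree (and thus the amount of
 * work per subtree) may be highly irregular. */
typedef struct node {
    int value;
    struct node *left, *right;
} node;

/* Sums all values in the tree. The two recursive calls do not depend
 * on each other's results until the final addition, so they form the
 * implicit parallelism described above. */
int tree_sum(const node *t) {
    if (t == NULL)
        return 0;
    int l = tree_sum(t->left);   /* independent recursive call 1 */
    int r = tree_sum(t->right);  /* independent recursive call 2 */
    return t->value + l + r;
}
```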
Of course, recursive programs can be brought into iterative form, but this often results in a less understandable program and, most importantly, will still not allow for efficient automatic parallelization if the computational structure is irregular.
The purpose of this technical report is to show that automatic parallelization is practical for such recursive programs on Shared Memory Multiprocessor (SMP) machines. As a result, SMP parallelism can be exploited efficiently by programmers without any knowledge of parallel programming.
To introduce parallelism, we replace each parallelizable recursive call with a call that may spawn a new thread for the current recursion step.
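The transformation can be sketched as follows; this is a hand-written illustration using POSIX threads, not REAPAR's actual generated code. The argument struct, the wrapper function, and the `should_fork` policy are all assumptions made for the example. One recursive call is handed to a freshly created thread while the current thread continues with the other call; if thread creation fails or the policy declines, the program falls back to the ordinary sequential calls.

```c
#include <stddef.h>
#include <pthread.h>

typedef struct node { int value; struct node *left, *right; } node;

/* Argument/result record for the call executed in a separate thread. */
typedef struct { const node *t; int result; } call_arg;

static int tree_sum(const node *t);

/* Thread wrapper around the recursive call. */
static void *tree_sum_thread(void *p) {
    call_arg *a = p;
    a->result = tree_sum(a->t);
    return NULL;
}

/* Placeholder for a parallelization strategy deciding whether the
 * current recursion step is worth a new thread (illustrative only). */
static int should_fork(void) { return 1; }

static int tree_sum(const node *t) {
    if (t == NULL)
        return 0;
    if (should_fork()) {
        pthread_t tid;
        call_arg left = { t->left, 0 };
        if (pthread_create(&tid, NULL, tree_sum_thread, &left) == 0) {
            int r = tree_sum(t->right);  /* current thread keeps one call */
            pthread_join(tid, NULL);
            return t->value + left.result + r;
        }
        /* Thread creation failed: fall through to sequential execution. */
    }
    return t->value + tree_sum(t->left) + tree_sum(t->right);
}
```

A realistic strategy would not fork unconditionally: near the leaves the work per call is too small to amortize thread-creation overhead, which is precisely why the choice of strategy matters.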
Additionally, recursion profile information is often useful to gain further insight into how a program works and which parallelization strategy performs best. To obtain this information, the system automatically instruments the user program to collect and output profiles at runtime.
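As a sketch of what such instrumentation might look like, the following (hypothetical, hand-written) version counts recursive calls per recursion depth; the counter array, the depth parameter, and the output routine stand in for whatever the instrumenter actually inserts.

```c
#include <stddef.h>
#include <stdio.h>

#define MAX_DEPTH 64

/* Per-depth call counters filled in during execution. */
static unsigned long calls_at_depth[MAX_DEPTH];

typedef struct node { int value; struct node *left, *right; } node;

static int tree_sum(const node *t, int depth) {
    if (t == NULL)
        return 0;
    if (depth < MAX_DEPTH)
        calls_at_depth[depth]++;  /* inserted by the instrumentation */
    return t->value + tree_sum(t->left, depth + 1)
                    + tree_sum(t->right, depth + 1);
}

/* Emit the recursion profile, e.g. at program exit. */
static void print_profile(void) {
    for (int d = 0; d < MAX_DEPTH; d++)
        if (calls_at_depth[d])
            printf("depth %d: %lu calls\n", d, calls_at_depth[d]);
}
```

A profile of this shape reveals, for instance, whether the recursion is shallow and bushy or deep and narrow, which in turn suggests whether threads should be spawned near the root or deeper in the recursion.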
Together with the automatic selection of an appropriate parallelization strategy based on the recorded profiles, these components form a system called REAPAR (for "REcursive programs Automatically PARallelized" - reaping the fruits of potential parallelism).