By Michael Förster

Numerical programs frequently use parallel programming techniques such as OpenMP to compute their output values as efficiently as possible. In addition, the derivatives of those output values with respect to certain input values often play a crucial role. To obtain code that computes not only the output values but also the derivative values in parallel, this work introduces several source-to-source transformation rules. These rules are based on a technique known as algorithmic differentiation. The main focus of this work lies on the important reverse mode of algorithmic differentiation. The inherent data-flow reversal of the reverse mode must be handled properly by the transformation. The first part of the work examines the transformations in a very general way, since pragma-based parallel regions occur in many different forms such as OpenMP, OpenACC, and Intel Phi. The second part describes the transformation rules for the most important OpenMP constructs.

Each thread can execute a path of statements that is different from that of the other threads. The implementation may cause any thread to suspend execution of its implicit task at a task scheduling point, and switch to execute any explicit task generated by any of the threads in the team, before eventually resuming execution of the implicit task [...]. There is an implied barrier at the end of a parallel region. After the end of a parallel region, only the master thread of the team resumes execution of the enclosing task region.
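The behavior described above can be illustrated with a minimal sketch. It is not taken from the book: the function name `count_team_members` and the `flag` array are hypothetical, and the `#ifdef _OPENMP` guards are only there so that the code also compiles, and runs with a single thread, when OpenMP is disabled. Each thread takes a different branch depending on its thread number, and the code after the region relies on the implied barrier to read what all threads have written.

```cpp
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical example: returns the team size if every thread of the team
// has written its flag by the time the parallel region has ended, else -1.
int count_team_members() {
#ifdef _OPENMP
    const int max_threads = omp_get_max_threads();  // upper bound on team size
#else
    const int max_threads = 1;                      // serial fallback
#endif
    std::vector<int> flag(max_threads, 0);
    int team_size = 1;

    #pragma omp parallel
    {
#ifdef _OPENMP
        const int tid = omp_get_thread_num();
#else
        const int tid = 0;
#endif
        if (tid == 0) {
            // Path taken only by the master thread: record the team size.
#ifdef _OPENMP
            team_size = omp_get_num_threads();
#endif
            flag[tid] = 1;
        } else {
            // Path taken by all other threads.
            flag[tid] = 1;
        }
    }  // implied barrier: all writes above are complete past this point

    // Only the master thread continues here.
    int written = 0;
    for (int f : flag) written += f;
    return written == team_size ? team_size : -1;
}
```

Calling `count_team_members()` should return the number of threads in the team. A return value of -1 would indicate that some thread had not finished before the sequential part resumed, which the implied barrier rules out.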

Figure: Data decomposition for a typical SPMD problem.

The approach shown in the figure is called data decomposition. It allows an execution of the program in which each processor fetches its own instructions and operates on its own data. The underlying model is the Single Program Multiple Data (SPMD) model, which is very often used in parallel programming [66]. Each thread gets its own chunk of data, but all threads of the group execute the same set of instructions, only with different offsets for accessing the data. However, OpenMP could not call itself an API for abstracting low-level thread programming if it did not provide mechanisms to do this data decomposition implicitly.
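As a rough sketch of the two styles mentioned above, the following code scales a vector once with an explicit, hand-written data decomposition and once with the implicit decomposition provided by an OpenMP worksharing loop. The names `Chunk`, `chunk_of`, `scale_explicit`, and `scale_implicit` are hypothetical and chosen only for illustration. The block partitioning in `chunk_of` is one possible choice, not necessarily the one used in the book.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Half-open index range [begin, end) owned by one thread.
struct Chunk { std::size_t begin, end; };

// Split n elements into num_threads contiguous chunks; the first
// (n % num_threads) chunks receive one extra element.
Chunk chunk_of(std::size_t n, int tid, int num_threads) {
    assert(num_threads > 0 && tid >= 0 && tid < num_threads);
    const std::size_t p = static_cast<std::size_t>(num_threads);
    const std::size_t t = static_cast<std::size_t>(tid);
    const std::size_t base = n / p, rest = n % p;
    const std::size_t begin = t * base + (t < rest ? t : rest);
    const std::size_t end = begin + base + (t < rest ? 1 : 0);
    return {begin, end};
}

// Explicit SPMD: every thread runs the same code on its own offsets.
void scale_explicit(std::vector<double>& x, double a) {
    #pragma omp parallel
    {
#ifdef _OPENMP
        const int tid = omp_get_thread_num();
        const int p = omp_get_num_threads();
#else
        const int tid = 0, p = 1;  // serial fallback without OpenMP
#endif
        const Chunk c = chunk_of(x.size(), tid, p);
        for (std::size_t i = c.begin; i < c.end; ++i) x[i] *= a;
    }
}

// Implicit decomposition: the worksharing loop distributes the iterations.
void scale_implicit(std::vector<double>& x, double a) {
    const long n = static_cast<long>(x.size());
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) x[i] *= a;
}
```

Both functions produce the same result. The explicit version makes the per-thread offsets visible, while the worksharing version leaves the assignment of iterations to threads to the OpenMP runtime.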

"The values of data in the threadprivate variables of non-initial threads are guaranteed to persist between two consecutive active parallel regions only if all the following conditions hold:
• Neither parallel region is nested inside another explicit parallel region.
• The number of threads used to execute both parallel regions is the same.
[...]"

private clause

To declare thread-local data, one can use the private clause.

Citation 30 (p. 96). "The private clause declares one or more list items to be private to a task.
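A small sketch of the private clause follows. The function `sum_of_squares` and its variable names are hypothetical. The sketch assumes a combined `parallel for` construct with a `reduction` clause to collect the per-thread results, which is one common way to use a private temporary and is not taken from the book.

```cpp
#include <vector>

// Hypothetical example: tmp is listed in the private clause, so each thread
// works on its own copy. A private copy starts uninitialized, which is why
// tmp is assigned before it is read in every iteration.
double sum_of_squares(const std::vector<double>& x) {
    double tmp = 0.0;    // original list item, declared outside the region
    double total = 0.0;  // combined across threads by the reduction clause
    const long n = static_cast<long>(x.size());

    #pragma omp parallel for private(tmp) reduction(+ : total)
    for (long i = 0; i < n; ++i) {
        tmp = x[i] * x[i];  // write to this thread's private copy
        total += tmp;       // accumulate into the thread's partial sum
    }
    return total;
}
```

Without the private clause, `tmp` would be shared by default and all threads would write to the same memory location, which is a data race.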

