
7.1 Parallel Programs

A parallel program is a system <P, X, Y> of infinitely many deterministic sequential programs P₁, P₂, . . . , infinitely many input variables X(1), X(2), . . . , and infinitely many output variables Y(1), Y(2), . . . The sequential programs P₁, P₂, . . . are assumed to be identical, except for the ability of each P_i to refer to its own index i. That is, for each pair of indices i and j the sequential program P_j can be obtained from the sequential program P_i by replacing each reference to i in P_i with a reference to j.

At the start of a computation, the input of the parallel program is stored in its input variables. An input that consists of N values is stored in X(1), . . . , X(N), where each of the variables holds one of the input values. During the computation, the program employs P₁, . . . , P_m for some m dependent on the input. Each P_i is assumed to know the value of N and the value of m. Upon halting, the output of the program is assumed to be in its output variables. An output that consists of K values is assumed to be in Y(1), . . . , Y(K), where each of the variables holds one output value.

Each step in a computation of the parallel program consists of four phases, as follows.

Each P_i reads an input value from one of the input variables X(1), . . . , X(N).
Each P_i performs some internal computation.
Each P_i may write into one of the output variables Y(1), Y(2), . . .
P₁, . . . , P_m communicate any desirable information among themselves.

Each of the phases is synchronized to be carried out in parallel by all the sequential programs P₁, . . . , P_m.

Although two or more sequential programs may read simultaneously from the same input variable, at no step may they write into the same output variable.
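The write restriction can be made concrete with a small sketch. The function name `apply_writes` and the dictionary representation of the output variables are assumptions made for illustration only; the text does not prescribe any implementation.

```python
def apply_writes(Y, writes):
    """Apply the write phase of one step (illustrative sketch).

    Y      -- dict mapping an output index k to the value held by Y(k)
    writes -- dict mapping a program index i to a pair (k, value),
              meaning that P_i wants to write `value` into Y(k)

    Simultaneous writes into the same output variable are rejected,
    mirroring the restriction stated in the text.
    """
    targets = [k for (k, _value) in writes.values()]
    if len(targets) != len(set(targets)):
        raise ValueError("two programs wrote into the same output variable")
    for k, value in writes.values():
        Y[k] = value
    return Y
```

Simultaneous reads need no such check, because the model allows any number of sequential programs to read the same input variable in one step.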

The depth of a computation of a parallel program <P, X, Y> is the number of steps executed during the computation. The parallel program is said to have depth complexity D(N) if, for each N, all its computations over the inputs that consist of N values have depth at most D(N). The parallel program is said to have size complexity Z(N) if it employs no sequential programs other than P₁, . . . , P_{Z(N)} on each input that consists of N values.

The time required by a computation of a parallel program and that program's time complexity can be defined in a similar way. However, such notions are unmeasurable here because we have not yet specified how sequential programs communicate.

Example 7.1.1 Consider the problem Q of selecting the smallest value in a given set S. Restrict your attention to parallel programs that in each step allow each sequential program to receive information from no more than one sequential program.

The problem is solvable by a first parallel program <P, X, Y> of size complexity Z(N) ≤ N(N - 1)/2 and constant depth complexity, where N denotes the cardinality of the given set S. The parallel program can use a brute-force approach for this purpose.

Specifically, let each pair (i₁, i₂), such that 1 ≤ i₁ < i₂ ≤ N, correspond to a different i, such that 1 ≤ i ≤ N(N - 1)/2. For instance, the correspondence can be of the form i = 1 + 2 + ⋯ + (i₂ - 2) + i₁ = (i₂ - 2)(i₂ - 1)/2 + i₁ (see Figure 7.1.1).

Figure 7.1.1 An ordering i on the pairs (i₁, i₂), such that 1 ≤ i₁ < i₂.

Let P_(i₁,i₂) denote the sequential program P_i, where (i₁, i₂) is the pair that corresponds to i.
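The correspondence between pairs and indices can be checked with a short sketch. The function names `pair_to_index` and `index_to_pair` are hypothetical; the formula is the one given above, i = (i₂ - 2)(i₂ - 1)/2 + i₁, and the inverse is one possible way for P_i to derive its pair.

```python
def pair_to_index(i1, i2):
    """Index i that corresponds to the pair (i1, i2), where 1 <= i1 < i2."""
    return (i2 - 2) * (i2 - 1) // 2 + i1


def index_to_pair(i):
    """Recover the pair (i1, i2) from an index i >= 1."""
    i2 = 2
    # The largest index whose second component is i2 equals (i2 - 1) * i2 / 2.
    while (i2 - 1) * i2 // 2 < i:
        i2 += 1
    i1 = i - (i2 - 2) * (i2 - 1) // 2
    return i1, i2


# The pairs over {1, ..., 4} in the order induced by i.
pairs_in_order = [index_to_pair(i) for i in range(1, 7)]
# pairs_in_order == [(1, 2), (1, 3), (2, 3), (1, 4), (2, 4), (3, 4)]
```

The linear search in `index_to_pair` is chosen for readability; a closed form based on a square root would also work.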

Each computation of the program starts with a step in which each P_i derives the pair (i₁, i₂) that corresponds to i, 1 ≤ i ≤ N(N - 1)/2. The computation continues with two steps in which each P_(i₁,i₂) reads the elements of S that are stored in X(i₁) and X(i₂). In addition, in the third step each P_(i₁,i₂) compares the values read from X(i₁) and X(i₂), and communicates a "negative" outcome to P_i₁ or P_i₂. This outcome is communicated to P_i₁ if X(i₁) ≥ X(i₂). Otherwise, the outcome is communicated to P_i₂. During the fourth step, the only active sequential program is the P_j, 1 ≤ j ≤ N, that did not receive a negative outcome. During that step P_j reads the value of X(j) and writes it into Y(1). The computation terminates after the fourth step.
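The brute-force computation can be simulated sequentially. The sketch below is not a parallel implementation; it only replays what each pair program would do. It assumes a nonempty input and assumes that the negative outcome goes to the first index of a pair when X(i₁) ≥ X(i₂), and to the second index otherwise. The name `brute_force_min` is hypothetical.

```python
from itertools import combinations


def brute_force_min(S):
    """Sequentially replay the constant-depth program on a nonempty list S."""
    N = len(S)
    X = {k: S[k - 1] for k in range(1, N + 1)}  # input variables X(1..N)

    # Steps 1-3: one conceptual program per pair (i1, i2) with i1 < i2.
    negative = set()
    for i1, i2 in combinations(range(1, N + 1), 2):
        if X[i1] >= X[i2]:
            negative.add(i1)   # i1 cannot be the selected index
        else:
            negative.add(i2)   # i2 cannot be the selected index

    # Step 4: exactly one index received no negative outcome.
    survivors = [j for j in range(1, N + 1) if j not in negative]
    assert len(survivors) == 1
    return X[survivors[0]]     # the value written into Y(1)
```

Under the assumed tie-breaking rule, equal values eliminate the lower index, so exactly one index survives even when the smallest value occurs several times.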

The problem Q can also be solved by a second parallel program <P, X, Y> of size complexity Z(N) = ⌈N/2⌉ and depth complexity D(N) = O(log N). In this case the program simply repeatedly eliminates about half of the elements from S, until S is left with a single element.

At the first stage of each computation each P_i, 1 ≤ i ≤ ⌈N/2⌉, reads the values stored in X(2i - 1) and X(2i). In addition, each P_i compares the values that it read. If X(2i - 1) < X(2i), then P_i communicates to P_⌈i/2⌉ the value of X(2i - 1). Otherwise, P_i communicates to P_⌈i/2⌉ the value of X(2i). At the end of the first stage P₁, . . . , P_⌈⌈N/2⌉/2⌉ hold the elements of S that have not been eliminated yet.

At the start of each subsequent stage of the computation, a sequential program P_i considers itself active if and only if some values of S were communicated to it in the previous stage. During a given stage, each active P_i compares the values a₁ and a₂ that were communicated to it in the previous stage. If the values satisfy the relation a₁ < a₂, then P_i communicates a₁ to P_⌈i/2⌉. Otherwise, P_i communicates a₂ to P_⌈i/2⌉.

After O(log N) stages only P₁ is active, and it holds a single value of S. Then P₁ writes the value into Y(1) and the computation terminates.
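The halving scheme can be illustrated with a sequential simulation that counts the rounds of pairwise comparison. This is a sketch under assumptions: the name `tournament_min` is hypothetical, the input is nonempty, and a value left without a partner in a round is simply passed along, a case the text does not spell out.

```python
def tournament_min(S):
    """Simulate the halving program; return (minimum, number of rounds)."""
    values = list(S)
    rounds = 0
    while len(values) > 1:
        survivors = []
        # Each active program compares one pair and passes on the smaller value.
        for k in range(0, len(values), 2):
            pair = values[k:k + 2]        # one or two values
            survivors.append(min(pair))
        values = survivors
        rounds += 1
    return values[0], rounds


result = tournament_min([4, 2, 9, 1, 7, 3, 8, 6])
# result == (1, 3): eight values are reduced to one in three rounds
```

Each round roughly halves the number of remaining values, which is where the logarithmic depth comes from.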

Figure 7.1.2 illustrates the flow of information in the second parallel program during a computation.

Figure 7.1.2 Flow of information.

Similarly, the problem Q can be solved by a third parallel program <P, X, Y> of size complexity Z(N) < N/2 and depth complexity O(N/Z(N) + log Z(N)). At the start of each computation each P_i computes m = Z(N) and independently finds, in O(N/m) steps, the smallest value in X((i - 1)N/m + 1), . . . , X(iN/m). Then, as in the second parallel program above, P₁, . . . , P_m proceed in parallel to determine in O(log m) steps the smallest value among the m values that they hold.
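The two phases of this scheme can be sketched as a sequential simulation. The name `blocked_min` is hypothetical, and the sketch assumes a nonempty input, m ≥ 1, and blocks of ⌈N/m⌉ consecutive input variables, with the last block possibly shorter. The rounding is an assumption made so that the code works for any N and m.

```python
import math


def blocked_min(S, m):
    """Simulate the block-then-tree program with m sequential programs.

    Returns (minimum, block length, number of tree rounds).
    """
    N = len(S)
    block = math.ceil(N / m)

    # Phase 1: each P_i scans its own block in about N/m steps.
    local = []
    for i in range(1, m + 1):
        chunk = S[(i - 1) * block:i * block]
        if chunk:                          # trailing blocks may be empty
            local.append(min(chunk))

    # Phase 2: pairwise elimination among the local minima.
    rounds = 0
    while len(local) > 1:
        local = [min(local[k:k + 2]) for k in range(0, len(local), 2)]
        rounds += 1
    return local[0], block, rounds


outcome = blocked_min([9, 4, 7, 1, 8, 2, 6, 3, 5], 3)
# outcome == (1, 3, 2): blocks of 3 values, then 2 tree rounds
```

The block length and the number of tree rounds correspond to the two terms of the bound N/Z(N) + log Z(N). For example, with m about N/log₂ N both terms are O(log N).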
