
3.4 Limitations of Recursive Finite-Domain Programs

      A Pumping Lemma for Context-Free Languages
      Applications of the Pumping Lemma
      A Generalization for the Pumping Lemma

The study of the limitations of finite-memory programs in Section 2.4 relied on the following observation: A subcomputation of an accepting computation of a finite-memory program can be pumped to obtain new accepting computations if the subcomputation starts and ends at the same state. For recursive finite-domain programs similar, but somewhat more complex, conditions are needed to allow pumping of subcomputations.

A Pumping Lemma for Context-Free Languages

The proof of the following theorem uses the abstraction of context-free grammars to provide conditions under which subcomputations of recursive finite-domain programs can be pumped. The corresponding theorem for the degenerate case of finite-memory programs is implied by the choice of u = v = ε.

Theorem 3.4.1 (Pumping lemma for context-free languages) Every context-free language L has a positive integer constant m with the following property. If w is in L and |w| ≥ m, then w can be written as uvxyz, where uv^kxy^kz is in L for each k ≥ 0. Moreover, |vxy| ≤ m and |vy| > 0.

Proof Let G = <N, Σ, P, S> be any context-free grammar. Use t to denote the number of symbols in the longest right-hand side of the production rules of G. With no loss of generality assume that t ≥ 2. Use |N| to denote the number of nonterminal symbols in N. Choose m to equal t^(|N|+1).

Consider any w in L(G) such that |w| ≥ m. Let T denote a derivation tree for w that is minimal for w in the number of nodes. Let π be a longest path from the root to a leaf in T. Let n denote the number of nodes in π.

The number of leaves in T is at most t^(n-1). Thus, t^(n-1) ≥ |w| and |w| ≥ m = t^(|N|+1) imply that n ≥ |N| + 2. Since only the last node of π is a leaf, at least |N| + 1 nodes of π are labeled with nonterminal symbols. That is, the path π must have two nodes whose corresponding nonterminal symbols, say E and F, are equal. As a result, w can be written as uvxyz, where vxy and x are the strings that correspond to the leaves of the subtrees of T with roots E and F, respectively (see Figure 3.4.1(a)).

Figure 3.4.1

(a) A derivation tree T with E = F. (b) The derivation tree T_k.

Let T_k be the derivation tree T modified so that the subtree of E, excluding the subtree of F, is pumped k times (see Figure 3.4.1(b)). Then T_k is also a derivation tree in G for each k ≥ 0. It follows that uv^kxy^kz, which corresponds to the leaves of T_k, is also in L(G) for each k ≥ 0.

A choice of E and F from the last |N| + 1 nonterminal symbols in the path π implies that |vxy| ≤ t^(|N|+1) = m, because each path from E to a leaf contains at most |N| + 2 nodes. Moreover, |vy| > 0, because otherwise T₀ would also be a derivation tree for w, contradicting the assumption that T is a minimal derivation tree for w.
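The pumping step of the proof can be illustrated with a small sketch. It is not taken from the text: the tree encoding, the helper names (`tree_yield`, `subtree`, `replace`, `pump`), and the grammar S -> aSb | c are all assumptions chosen for illustration. The sketch builds T_k by repeating, k times, the part of the subtree of E that lies outside the subtree of F.

```python
# A derivation tree node is (nonterminal, [children]); a leaf is a terminal string.

def tree_yield(tree):
    """Concatenate the leaves of the tree from left to right."""
    if isinstance(tree, str):
        return tree
    _, children = tree
    return "".join(tree_yield(child) for child in children)

def subtree(tree, path):
    """Follow a list of child indices down from the root of tree."""
    for i in path:
        tree = tree[1][i]
    return tree

def replace(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    label, children = tree
    i = path[0]
    patched = replace(children[i], path[1:], new)
    return (label, children[:i] + [patched] + children[i + 1:])

def pump(tree, path_to_e, path_e_to_f, k):
    """Build T_k: the part of E's subtree outside F's subtree occurs k times."""
    e = subtree(tree, path_to_e)
    pumped = subtree(e, path_e_to_f)          # start from the subtree of F
    for _ in range(k):
        pumped = replace(e, path_e_to_f, pumped)
    return replace(tree, path_to_e, pumped)

# Hypothetical grammar S -> aSb | c and a derivation tree T for w = aacbb.
# E is the second S on the path (child 1 of the root); F is child 1 of E.
T = ("S", ["a", ("S", ["a", ("S", ["c"]), "b"]), "b"])
for k in range(4):
    print(k, tree_yield(pump(T, [1], [1], k)))
```

In this hypothetical tree u = a, v = a, x = c, y = b, and z = b, so the leaves of T_k spell a a^k c b^k b, and k = 1 gives back w.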

Example 3.4.1 Let G = <N, Σ, P, S> be the context-free grammar whose production rules are listed below.

For G, using the terminology of the proof of the previous theorem, t = 2, |N| = 2, and m = 8. The string w = (ab)³a(ab)² has the derivation tree given in Figure 3.4.2(a).
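The constant m = t^(|N|+1) from the proof is easy to compute mechanically. The following is a minimal sketch, assuming a grammar is given as a dictionary that maps every nonterminal to its list of right-hand sides, one character per symbol. The function name `pumping_constant` and the sample grammar are hypothetical; the sample is not the grammar G of this example.

```python
def pumping_constant(productions):
    """Return m = t ** (|N| + 1) for a grammar given as {nonterminal: [rhs, ...]}."""
    # t is the length of the longest right-hand side.
    t = max(len(rhs) for alternatives in productions.values() for rhs in alternatives)
    t = max(t, 2)  # the proof assumes t >= 2 without loss of generality
    return t ** (len(productions) + 1)

# Hypothetical grammar S -> aSb | c, so t = 3 and |N| = 1.
hypothetical = {"S": ["aSb", "c"]}
print(pumping_constant(hypothetical))

# The values quoted in the example, t = 2 and |N| = 2, give m = 8.
print(2 ** (2 + 1))
```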

Figure 3.4.2 (a) A derivation tree T for (ab)³a(ab)². (b) A derivation tree T_k for (ab)²(aba)^k(ab)². (c) A derivation tree T_k for ab(ab)^kaba^k(ab)².

A longest path in the tree, from the root to a leaf, contains six nodes.

The string w has two decompositions that satisfy the constraints of the proof of the pumping lemma. One is of the form u = ab, v = ε, x = ab, y = aba, z = abab; the other is of the form u = ab, v = ab, x = ab, y = a, z = abab.

The strings (ab)²(aba)^k(ab)² and ab(ab)^kaba^k(ab)², for k ≥ 0, are the new strings in the language that the proof implies for w by pumping. Figures 3.4.2(b) and 3.4.2(c), respectively, show the derivation trees T_k for these strings.

Applications of the Pumping Lemma

The pumping lemma for context-free languages can be used to show that a language is not context-free. The method is similar to that for using the pumping lemma for regular languages to show that a language is not regular.

Example 3.4.2 Let L be the language { aⁿbⁿcⁿ | n ≥ 0 }. To show that L is not a context-free language, assume to the contrary that L is context-free. Consider the choice of w = a^mb^mc^m, where m is the constant implied by the pumping lemma for L.

By the lemma, a^mb^mc^m can be written as uvxyz, where |vxy| ≤ m, |vy| > 0, and the decomposition satisfies at least one of the following conditions.

(a) vy contains a's or b's but not c's.
(b) vy contains a's or c's but not b's.
(c) vy contains b's or c's but not a's.

Moreover, by the pumping lemma, uv^kxy^kz is also in L for each k ≥ 0. However, for (a) the choice of k = 0 implies uv⁰xy⁰z not in L because of too many c's. Similarly, for (b) the choice of k = 0 implies uv⁰xy⁰z not in L because of too many b's, and for (c) the choice of k = 0 implies uv⁰xy⁰z not in L because of too many a's.

Since the pumping lemma does not hold for a^mb^mc^m, it also does not hold for L. It follows, therefore, that the assumption that L is a context-free language is false.
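For small values of m the argument can be confirmed by exhaustive search. The following sketch is an illustration and not a proof: it enumerates every decomposition uvxyz of a^mb^mc^m with |vxy| ≤ m and |vy| > 0 and reports whether any of them stays in L when pumped. The names `in_L`, `decompositions`, and `pumpable` are assumptions made for this sketch.

```python
def in_L(s):
    """Membership test for { a^n b^n c^n | n >= 0 }."""
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n

def decompositions(w, m):
    """Yield every (u, v, x, y, z) with uvxyz = w, |vxy| <= m and |vy| > 0."""
    n = len(w)
    for i in range(n + 1):                        # u = w[:i]
        for q in range(i, min(n, i + m) + 1):     # vxy = w[i:q]
            for j in range(i, q + 1):             # v = w[i:j]
                for p in range(j, q + 1):         # x = w[j:p], y = w[p:q]
                    if (j - i) + (q - p) > 0:
                        yield w[:i], w[i:j], w[j:p], w[p:q], w[q:]

def pumpable(w, m, member, ks=(0, 1, 2)):
    """Return a decomposition whose pumped strings are all members, else None."""
    for u, v, x, y, z in decompositions(w, m):
        if all(member(u + v * k + x + y * k + z) for k in ks):
            return (u, v, x, y, z)
    return None

for m in range(1, 5):
    w = "a" * m + "b" * m + "c" * m
    print(m, pumpable(w, m, in_L))
```

Every decomposition fails already at k = 0, which matches the case analysis above, so the search finds no pumpable decomposition for these values of m.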

As in the case of the pumping lemma for regular languages, the choice of the string w is of critical importance when trying to show that a language is not context-free.

Example 3.4.3 Consider the language L = { αα | α is in {a, b}* }. To show that L is not a context-free language, assume the contrary. Let m be the constant implied by the pumping lemma for L.

For the choice w = a^mb^ma^mb^m the pumping lemma implies a decomposition uvxyz such that |vxy| ≤ m and |vy| > 0. For such a choice uv⁰xy⁰z = uxz = aⁱb^ja^sb^t with either i ≠ s or j ≠ t. In either case, uxz is not in L. As a result, L cannot be context-free.

On the other hand, for the choice w = a^mba^mb a decomposition uvxyz that satisfies |vxy| ≤ m and |vy| > 0 might be of the form v = y = a^j with b in x for some j > 0. With such a decomposition uv^kxy^kz = a^(m+(k-1)j) b a^(m+(k-1)j) b is also in L for all k ≥ 0. Consequently, the latter choice for w does not imply the desired contradiction.
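The contrast between the two choices of w can be seen in a short experiment. The sketch below fixes m = 3 purely for illustration; the real constant is unknown. It checks that no admissible decomposition of a^mb^ma^mb^m survives k = 0, and that a^mba^mb has a decomposition that pumps for every k tried. The helper names `is_square` and `decompositions` are assumptions.

```python
def is_square(s):
    """Membership test for { alpha alpha | alpha in {a, b}* }."""
    half, odd = divmod(len(s), 2)
    return odd == 0 and s[:half] == s[half:]

def decompositions(w, m):
    """Yield every (u, v, x, y, z) with uvxyz = w, |vxy| <= m and |vy| > 0."""
    n = len(w)
    for i in range(n + 1):                        # u = w[:i]
        for q in range(i, min(n, i + m) + 1):     # vxy = w[i:q]
            for j in range(i, q + 1):             # v = w[i:j]
                for p in range(j, q + 1):         # x = w[j:p], y = w[p:q]
                    if (j - i) + (q - p) > 0:
                        yield w[:i], w[i:j], w[j:p], w[p:q], w[q:]

m = 3  # illustrative value only

# Useful choice: every decomposition leaves L when v and y are deleted.
useful = "a" * m + "b" * m + "a" * m + "b" * m
print(any(is_square(u + x + z) for u, v, x, y, z in decompositions(useful, m)))

# Useless choice: v = y = a^j with the first b inside x keeps the string in L.
j = 1
u, v, x, y, z = "a" * (m - j), "a" * j, "b", "a" * j, "a" * (m - j) + "b"
useless = "a" * m + "b" + "a" * m + "b"
assert u + v + x + y + z == useless and len(v + x + y) <= m
print(all(is_square(u + v * k + x + y * k + z) for k in range(6)))
```

The first check finds no surviving decomposition, while the second confirms that the single decomposition shown pumps successfully, so a^mba^mb cannot yield a contradiction.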

A Generalization for the Pumping Lemma

The pumping lemma for context-free languages can be generalized to relations that are computable by pushdown transducers. This generalized pumping lemma, in turn, can be used to determine relations that cannot be computed by pushdown transducers.

Theorem 3.4.2 For each relation R that is computable by a pushdown transducer, there exists a constant m such that the following holds for each (w₁, w₂) in R. If |w₁| + |w₂| ≥ m, then w₁ can be written as u₁v₁x₁y₁z₁ and w₂ can be written as u₂v₂x₂y₂z₂, where (u₁v₁^kx₁y₁^kz₁, u₂v₂^kx₂y₂^kz₂) is also in R for each k ≥ 0. Moreover, |v₁x₁y₁| + |v₂x₂y₂| ≤ m and |v₁y₁| + |v₂y₂| > 0.

Proof Consider any pushdown transducer M₁. Let M₂ be the pushdown automaton obtained from M₁ by replacing each transition rule of the form (q, α, β, p, γ, ρ) with a transition rule of the form (q, [α, ρ], β, p, γ) if the inequality [α, ρ] ≠ [ε, ε] holds, and with a transition rule of the form (q, ε, β, p, γ) if the equality [α, ρ] = [ε, ε] holds. Let h₁ and h₂ be the projection functions defined in the following way: h₁(ε) = h₂(ε) = ε, h₁([α, ρ]) = α, h₂([α, ρ]) = ρ, h₁([α, ρ]w) = h₁([α, ρ])h₁(w), and h₂([α, ρ]w) = h₂([α, ρ])h₂(w).
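The rule rewriting in this construction can be sketched as follows. The tuple layout is an assumption: a transducer rule is taken to be (state, input, pop, next state, push, output), the empty string stands for ε, and a paired symbol [input, output] is modeled as a Python tuple. The function name `encode_rule` and the sample rules are hypothetical.

```python
def encode_rule(rule):
    """Turn one pushdown-transducer rule into a pushdown-automaton rule.

    The input and output parts are merged into a single paired input symbol,
    unless both are empty, in which case the new rule reads no input.
    """
    q, alpha, beta, p, gamma, rho = rule
    if (alpha, rho) != ("", ""):
        return (q, (alpha, rho), beta, p, gamma)
    return (q, "", beta, p, gamma)

# Hypothetical transducer rules, not taken from any figure in the text.
rules = [
    ("q0", "a", "", "q0", "c", ""),    # read a, push c, write nothing
    ("q0", "b", "c", "q1", "", "b"),   # read b, pop c, write b
    ("q1", "", "", "q2", "", ""),      # move with no input and no output
]
for rule in rules:
    print(encode_rule(rule))
```

Only the third rule produces a move of M₂ that consumes no input, because it is the only one whose input and output are both empty.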

By construction M₂ encodes in its inputs the inputs and outputs of M₁. h₁ and h₂, respectively, determine the values of these encoded inputs and outputs. As a result, (w₁, w₂) is in R(M₁) if and only if w is in L(M₂) for some w such that h₁(w) = w₁ and h₂(w) = w₂. Use m' to denote the constant implied by the pumping lemma for context-free languages for L(M₂), and choose m = 2m'.

Consider any (w₁, w₂) in the relation R(M₁) such that |w₁| + |w₂| ≥ m. Then there is some w in the language L(M₂) such that h₁(w) = w₁, h₂(w) = w₂, and |w| ≥ m/2 = m'. By the pumping lemma for context-free languages w can be written as uvxyz, where |vxy| ≤ m', |vy| > 0, and uv^kxy^kz is in L(M₂) for each k ≥ 0. The result then follows if one chooses u₁ = h₁(u), u₂ = h₂(u), v₁ = h₁(v), v₂ = h₂(v), x₁ = h₁(x), x₂ = h₂(x), y₁ = h₁(y), y₂ = h₂(y), z₁ = h₁(z), and z₂ = h₂(z).

Example 3.4.4 Let M₁ be the pushdown transducer whose transition diagram is given in Figure 3.2.3. Using the terminology of the proof of Theorem 3.4.2, M₂ is the pushdown automaton whose transition diagram is given in Figure 3.4.3.

Figure 3.4.3

A pushdown automaton that "encodes" the pushdown transducer of Figure 3.2.3.

The computation of M₁ on input aabbaa gives the output baa. The computation of M₁ on input aabbaa corresponds to the computation of M₂ on input [a, ε][a, ε][b, ε][b, b][a, a][a, a].
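The projection functions h₁ and h₂ can be applied to this encoded input in a few lines. In the sketch below each paired symbol is modeled as a Python tuple (input part, output part) with the empty string standing for ε; that representation and the variable name `encoded` are assumptions, and the output parts of the first three pairs are taken to be empty so that the projections agree with the input aabbaa and the output baa.

```python
def h1(w):
    """Project a sequence of paired symbols onto its input components."""
    return "".join(inp for inp, _ in w)

def h2(w):
    """Project a sequence of paired symbols onto its output components."""
    return "".join(out for _, out in w)

# Encoded input of M2 for the computation of M1 on aabbaa.
encoded = [("a", ""), ("a", ""), ("b", ""), ("b", "b"), ("a", "a"), ("a", "a")]

print(h1(encoded))
print(h2(encoded))

# Projections distribute over concatenation, which is what lets a decomposition
# uvxyz of the encoded string induce decompositions of both w1 and w2.
u, rest = encoded[:2], encoded[2:]
assert h1(u) + h1(rest) == h1(encoded)
assert h2(u) + h2(rest) == h2(encoded)
```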
