# LL(k)

Given a word $u$ and a context-free grammar $G$, how do we determine if $u\in L(G)$?

There are in general two ways to proceed. Start from $u$, and proceed backward to find $v$ such that $v\Rightarrow u$. Keep going until a derivation $\sigma\Rightarrow^{*}u$ is found. This procedure is known as the bottom-up parsing of $u$. The other method the top-down approach: begin with the starting symbol $\sigma$, and work its way down to $u$, so $\sigma\Rightarrow^{*}u$.

As with the bottom-up approach, finding a derivation of $u$ from the top-down may be time consuming, if one is lucky enough to find a derivation at all.

There is a class of grammars, known as the $LL(k)$ grammars, which make the top-down parsing of a word natural and direct. The first $L$ in $LL(k)$ means scanning the symbols of $u$ from left to right, the second $L$ stands for finding a leftmost derivation ($\Rightarrow_{L}$) for $u$, and $k$ means having the allowance to look at up to $k$ symbols ahead while scanning.

Definition. Let $G=(\Sigma,N,P,\sigma)$ be a context-free grammar such that $\sigma\to\sigma$ is not a production of $G$, and $k\geq 0$ an integer. Suppose $u\in L(G)$ with a $X\to U_{1}$ a production in a leftmost derivation of $u$:

 $\sigma\Rightarrow_{L}^{*}UXU_{2}\Rightarrow_{L}UU_{1}U_{2}\Rightarrow_{L}^{*}u.$

Let $n=|U|+k$ and $v$ be the prefix of $u$ of length $n$ (if $|u|, then set $v=u$).

Then $G$ is said to be $LL(k)$ if for any $w\in L(G)$, with $v$ as a prefix, such that there is a production $X\to W_{1}$ in a leftmost derivation of $w$:

 $\sigma\Rightarrow_{L}^{*}UXW_{2}\Rightarrow_{L}UW_{1}W_{2}\Rightarrow_{L}^{*}w,$

implies that $W_{1}=U_{1}$.

In a leftmost derivation $D_{u}$ of a word $u$, call a prefix $v$ of $u$ is a leftmost descendant of a production $P\to U$ if $\sigma\Rightarrow^{*}vPU^{\prime}\Rightarrow vUU^{\prime}\Rightarrow^{*}u$ is $D_{u}$. Then the definition above can be restated in words as follows:

Given a leftmost derivation $D_{u}$ of a word $u$, a production used in $D_{u}$ is uniquely determined up to $k$ symbols beyond the prefix of $u$ which is a leftmost descendant of the production. In other words, if $D_{u}$ and $D_{w}$ are leftmost derivations of $u$ and $w$ which agree on $k$ symbols beyond the common prefix $v$, where $v$ is both a leftmost descendant of $X\to U$ used in $D_{u}$, and a leftmost descendant of $X\to W$ used in $D_{w}$, then $X\to U$ and $X\to W$ are the same production, i.e. $U=W$.

Every $LL(k)$ is unambiguous. Furthermore, every $LL(k)$ grammar is $LR(k)$ (http://planetmath.org/LRk).

Given a context-free grammar $G$ and $k\geq 0$, there is an algorithm deciding whether $G$ is $LL(k)$.

Examples

• The grammar $G$ over $\Sigma=\{a,b\}$, with productions $\sigma\to a^{2}\sigma b^{2}$, $\sigma\to a$ and $\sigma\to\lambda$ is $LL(2)$ but not $LL(1)$. It is not hard to see that $L(G)$ is the set $\{a^{m}b^{n}\mid n\mbox{ is even, and }n\leq m\leq n+1\}$. On the other hand, the grammar $G^{\prime}$ over $\Sigma$, with productions

 $\sigma\to aX,\quad\sigma\to\lambda,\quad X\to aYb,\quad X\to\lambda,\quad Y\to aXb% ,\quad Y\to b$

also generates $L(G)$, but is $LL(1)$ instead.

• The grammar $G$ over $\{a,b,c\}$, with productions

 $\sigma\to X,\quad\sigma\to Y,\quad X\to aXb,\quad X\to ab,\quad Y\to aYc,\quad Y% \to ac$

is not $LL(k)$ for any $k\geq 0$.

Definition A language is said to be $LL(k)$ if it is generated by an $LL(k)$ grammar. The family of $LL(k)$ languages is denoted by $\mathscr{LL}(k)$.

It is easy to see that an $LL(0)$ contains no more than one word. Furthermore, it can be shown that

 $\mathscr{LL}(0)\subset\mathscr{LL}(1)\subset\cdots\subset\mathscr{LL}(k)% \subset\cdots,$

and the inclusion is strict. If $\mathscr{LL}(k)^{\prime}$ denotes the family of $\lambda$-free $LL(k)$ languages, then

 $\mathscr{LL}(0)^{\prime}=\mathscr{LL}(1)^{\prime}=\cdots=\mathscr{LL}(k)^{% \prime}=\cdots.$

Given two $LL(k)$ grammars $G_{1}$ and $G_{2}$, there is an algorithm that decides if $L(G_{1})=L(G_{2})$.

## References

Title LL(k) LLk 2013-03-22 19:00:54 2013-03-22 19:00:54 CWoo (3771) CWoo (3771) 8 CWoo (3771) Definition msc 68Q05 msc 68Q42 msc 03D10 LRk