star-free


Let ā„› and š’¢ā¢ā„› be the families of languagesPlanetmathPlanetmath represented by regular expressionsMathworldPlanetmath and generalized regular expressions respectively. It is well-known that

ā„›=š’¢ā¢ā„›.

This means that the additional symbols āˆ© and Ā¬ in a generalized regular expressions are extraneous: removing them will not affect the representational power of the expressions with respect to regular languages.

What if we remove *, the Kleene star symbol, instead? With regard to regular expressions, removing * severely limits the power of the expressions. By inductionMathworldPlanetmath, the represented languages are all finite. In this entry, we briefly discuss what happens when * is removed from the generalized regular expressions.

Definition. Let Ī£ be an alphabet. A language L over Ī£ is said to be star-free if it can be expressed by a generalized regular expression without *. In other words, a star-free language is a language that can be obtained by applying the operationsMathworldPlanetmath of union, concatenationMathworldPlanetmath, and complementation to āˆ… and atomic languages (singleton subsets of Ī£) a finite number of times.

If we denote š’®ā¢ā„± the family of star-free languages (over some alphabet Ī£), then š’®ā¢ā„± is the smallest set of languages over Ī£ such that

  • ā€¢

    āˆ…āˆˆš’®ā¢ā„±,

  • ā€¢

    {a}āˆˆš’®ā¢ā„± for any aāˆˆĪ£,

  • ā€¢

    if L,Māˆˆš’®ā¢ā„±, then LāˆŖM,Lā¢M,Lcāˆˆš’®ā¢ā„±.

A shorter characterizationMathworldPlanetmath of a star-free language is a language with star height 0 with respect to representations by generalized regular expressions.

In relationsMathworldPlanetmathPlanetmath to finite and regular languages, we have the following:

ā„±āŠ†š’®ā¢ā„±āŠ†ā„›, (1)

where ā„± denotes the family of finite languages over Ī£.

Furthermore, it is easy to see that š’®ā¢ā„± is closed under Boolean operations, so that š’®ā¢ā„± contains infiniteMathworldPlanetmath languages, for example Ā¬ā¢āˆ… represents Ī£*. As a result, the first inclusion must be strict. This example also shows that languages representable by expressions including the Kleene star may still be star-free. Hereā€™s another example: {aā¢b}* over the alphabet {a,b}. This language can be represented as

Ī»āˆŖ(aā¢bā¢Ī£*āˆ©Ī£*ā¢aā¢bāˆ©Ā¬ā”(Ī£*ā¢a2ā¢Ī£*)āˆ©Ā¬ā”(Ī£*ā¢b2ā¢Ī£*))

The expression above, of course, is not star-free, and includes the symbol Ī» representing the empty wordPlanetmathPlanetmathPlanetmath. However, Ī£* is just Ā¬ā¢āˆ…, and Ī» is just Ī£*āˆ©Ā¬ā”(aā¢Ī£*)āˆ©Ā¬ā”(bā¢Ī£*). Some substitutions show that {aā¢b}* is indeed star-free.

Is the second inclusion strict? Are there regular languages such that representations by expressions including the Kleene star is inevitable? The following propositionPlanetmathPlanetmathPlanetmath answers the question:

Proposition 1.

A language L is star-free iff there exists a non-negative integer n such that, for any words u,v,w over Ī£, uā¢vnā¢wāˆˆL iff uā¢vn+1ā¢wāˆˆL.

A language satisfying the second statement in the proposition is known as noncounting, so the proposition can be restated as: a language is star-free iff it is noncounting.

As a result of this fact, we see that languages such as {(aā¢b)2}* is not star-free, although it is regularPlanetmathPlanetmath. Indeed, if we pick u=v=w=aā¢b as in the statement of the proposition above, we see that uā¢vnā¢w is in the language iff uā¢vn+1ā¢w is not in the language, for any nā‰„0. Therefore, the second inclusion in chain (1) above is also strict.

The above proposition also strengthens chain (1): denote by š’Æā¢(āˆž) the family of locally testable languages, then

š’Æā¢(āˆž)āŠ‚š’®ā¢ā„±āŠ‚ā„›. (2)

The first inclusion is due to the fact that, for any k-testable language L (over some Ī£), we have

swkā”(uā¢vkā¢w)=swkā”(uā¢vk+1ā¢w)

(the definition of swkā”(u) is found in the entry on locally testable languages). Note the first inclusion is also strict. For example, the language represented by (aā¢b)*āˆŖ(bā¢a)* is star-free but not locally testable.

References

  • 1 A. Ginzburg, Algebraic Theory of Automata, Academic Press (1968).
  • 2 A. Salomaa, Formal LanguagesMathworldPlanetmath, Academic Press, New York (1973).
Title star-free
Canonical name Starfree
Date of creation 2013-03-22 18:59:05
Last modified on 2013-03-22 18:59:05
Owner CWoo (3771)
Last modified by CWoo (3771)
Numerical id 16
Author CWoo (3771)
Entry type Definition
Classification msc 68Q42
Classification msc 68Q45
Synonym star free
Synonym non-counting
Synonym noncounting
Related topic StarHeight