Formal Language Article Index for
Formal
Website Links For
Formal
 

Information About

Formal Language




An alphabet might be \left \{ a , b ight \}, and a string over that alphabet might be ababba.

A typical language over that alphabet, containing that string, would be the set of all strings which contain the same number of symbols a and b.

The empty word (that is, length-zero string) is allowed and is often denoted by e, \epsilon or \Lambda. While the alphabet is a finite set and every string has finite length, a language may very well have infinitely many member strings (because the length of words in it may be unbounded).

A question often asked about formal languages is "how difficult is it to decide whether a given word belongs to the language?"
This is the domain of Computability Theory and Complexity Theory .


EXAMPLES


Some examples of formal languages:

  • the set of all words over {a, b}

  • the set \left \{ a^{n} ight\}, n is a Prime Number and a^{n} means a repeated n times

  • Finite languages, such as {a, aa, bba} -

  • the set of syntactically correct programs in a given programming language; or

  • the set of inputs upon which a certain Turing Machine halts.



SPECIFICATION


A formal language can be specified in a great variety of ways, such as:



OPERATIONS


Several operations can be used to produce new languages from given ones. Suppose L_{1} and L_{2} are languages over some common alphabet.
  • The ''concatenation'' L_{1}L_{2} consists of all strings of the form vw where v is a string from L_{1} and w is a string from L_{2}.

  • The ''intersection'' L_1 \cap L_2 of L_{1} and L_{2} consists of all strings which are contained in L_1 and also in L_{2}.

  • The ''union'' L_1 \cup L_2 of L_{1} and L_{2} consists of all strings which are contained in L_{1} or in L_{2}.

  • The ''complement'' of the language L_{1} consists of all strings over the alphabet which are not contained in L_{1}.

  • The ''right quotient'' L_{1}/L_{2} of L_{1} by L_{2} consists of all strings v for which there exists a string w in L_{2} such that vw is in L_{1}.

  • The '' Kleene Star '' L_{1}^{---} consists of all strings which can be written in the form w_{1}w_{2}...w_{n} with strings w_{i} in L_{1} and n \ge 0. Note that this includes the empty string \epsilon because n = 0 is allowed.

  • The ''reverse'' L_{1}^{R} contains the reversed versions of all the strings in L_{1}.

  • The ''shuffle'' of L_{1} and L_{2} consists of all strings which can be written in the form v_{1}w_{1}v_{2}w_{2}...v_{n}w_{n} where n \ge 1 and v_{1},...,v_{n} are strings such that the concatenation v_{1}...v_{n} is in L_{1} and w_{1},...,w_{n} are strings such that w_{1}...w_{n} is in L_{2}.



SEE ALSO



FURTHER READING

  • Hopcroft, J. & Ullman, J (1979). ''Introduction to Automata Theory, Languages, and Computation.'' Addison Wesley. ISBN 020102988X