Information AboutParser |
|
Parsing transforms input text into a data structure, usually a tree, which is suitable for later processing and which captures the implied hierarchy of the input. Generally, parsers operate in two stages, first identifying the meaningful Token s in the input, and then building a Parse Tree from those tokens. Human languages In some Machine Translation and Natural Language Processing systems, human languages are parsed by computer programs. Human sentences are not easily parsed by programs, as there is substantial ambiguity in the structure of human language. Programming languages The most common use of parsers is to parse Computer Programming Language s. These have simple and regular grammars. Parsers for programming languages tend to be based on Context-free Grammar s because fast and efficient parsers can be written for them. However, context-free grammars are limited in their expressiveness because they can describe only a limited set of languages. Informally, the reason is that the memory of such a language is limited. The grammar cannot remember the presence of a construct over an arbitrarily long input; this is necessary for a language in which, for example, a name must be declared before it may be referenced. More powerful grammars, however, cannot be parsed efficiently. Thus, it is a common strategy to create a relaxed parser for a context-free grammar which accepts a superset of the desired language constructs (that is, it accepts some invalid constructs); later, the unwanted constructs can be filtered out. It is usually easy to define a context-free grammar which includes all desired language constructs; on the other hand, it is often impossible to create a context-free grammar which admits ''only'' the desirable constructs. Parsers are usually not written by hand but are generated by Parser Generator s. Overview of process The example below demonstrates the common case of parsing a language with two levels of grammar: lexical and syntactic.
The next stage is syntactic parsing or Syntax Analysis , which is checking that the tokens form an allowable expression. This is usually done with reference to a Context-free Grammar which recursively defines components that can make up an expression and the order in which they must appear. The final phase is Semantic Parsing or analysis, which is working out the implications of the expression just validated and taking the appropriate action. In the case of a calculator, the action is to evaluate the expression; a compiler, on the other hand, would generate code. Types of parsers The task of the parser is essentially to determine if and how the input can be derived from the Start Symbol within the rules of the formal grammar. This can be done in essentially two ways:
Another important distinction is whether the parser generates a ''leftmost derivation'' or a ''rightmost derivation'' (see Context-free Grammar ). LL parsers will generate a leftmost Derivation and LR parsers will generate a rightmost derivation (although usually in reverse). Examples of parsers Top-down parsers Some of the parsers that use Top-down Parsing include: Bottom-up parsers Some of the parsers that use Bottom-up Parsing include:
See also
References External links
|