Earley Parser
Earley parsers are appealing because they can parse all context-free languages. The Earley parser executes in cubic time in the general case and in quadratic time for unambiguous grammars. It performs particularly well when the rules are written left-recursively.

PERFORMING THE ALGORITHM

To understand how Earley's algorithm, a top-down dynamic-programming algorithm, executes, you have to understand dot notation. Given a production A → B C D (where B, C, and D are symbols in the grammar, terminals or nonterminals), the notation A → B • C D represents a condition in which B has already been parsed and the sequence C D is expected.

For every input position (which represents a position between tokens), the parser generates a state set. Each state is a pair consisting of a dot condition for a particular production and the origin position, that is, the input position at which the matching of that production began.
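The state set at input position k is called S(k). The parser is seeded with S(0) consisting of only the top-level rule. The parser then iteratively operates in three stages: prediction, scanning, and completion. In the following descriptions, α, β, and γ represent any sequence of terminals/nonterminals (including the null sequence), X, Y, and Z represent single nonterminals, and a represents a terminal symbol.

Prediction: for every state in S(k) of the form (X → α • Y β, j), where j is the origin position, add (Y → • γ, k) to S(k) for every production with Y on the left-hand side.

Scanning: if a is the next symbol in the input stream, then for every state in S(k) of the form (X → α • a β, j), add (X → α a • β, j) to S(k+1).

Completion: for every state in S(k) of the form (X → γ •, j), find the states in S(j) of the form (Y → α • X β, i) and add (Y → α X • β, i) to S(k).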
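The three stages can be sketched as a small recognizer. The following is only an illustration under stated assumptions, not a reference implementation: the names earley_recognize, chart, and add are hypothetical, the grammar is assumed to contain no empty productions, any symbol that is not a key of the grammar dictionary is treated as a terminal, and the toy grammar E → E + n | n is made up for the demonstration.

```python
from typing import Dict, List, Sequence, Set, Tuple

# A state is (lhs, rhs, dot, origin): the production lhs -> rhs, the index of
# the dot inside rhs, and the input position where matching of it began.
State = Tuple[str, Tuple[str, ...], int, int]
Grammar = Dict[str, List[Tuple[str, ...]]]


def earley_recognize(grammar: Grammar, start: str, tokens: Sequence[str]) -> bool:
    """Return True if `tokens` can be derived from `start`.

    Assumption: the grammar has no empty productions. Symbols that are not
    keys of `grammar` are treated as terminals.
    """
    n = len(tokens)
    # chart[k] plays the role of S(k). The list gives a processing order (it
    # doubles as the work queue); the parallel set rejects duplicate states.
    chart: List[List[State]] = [[] for _ in range(n + 1)]
    seen: List[Set[State]] = [set() for _ in range(n + 1)]

    def add(k: int, state: State) -> None:
        if state not in seen[k]:
            seen[k].add(state)
            chart[k].append(state)

    # Seed S(0) with the productions of the start symbol.
    for rhs in grammar[start]:
        add(0, (start, rhs, 0, 0))

    for k in range(n + 1):
        i = 0
        while i < len(chart[k]):  # chart[k] may grow while it is processed
            lhs, rhs, dot, origin = chart[k][i]
            i += 1
            if dot < len(rhs):
                symbol = rhs[dot]
                if symbol in grammar:
                    # Prediction: expect every production of the nonterminal.
                    for production in grammar[symbol]:
                        add(k, (symbol, production, 0, k))
                elif k < n and tokens[k] == symbol:
                    # Scanning: the terminal matches the next input token.
                    add(k + 1, (lhs, rhs, dot + 1, origin))
            else:
                # Completion: advance every state in S(origin) waiting on lhs.
                for p_lhs, p_rhs, p_dot, p_origin in list(chart[origin]):
                    if p_dot < len(p_rhs) and p_rhs[p_dot] == lhs:
                        add(k, (p_lhs, p_rhs, p_dot + 1, p_origin))

    # Accept if a start production spanning the whole input is complete.
    return any(
        lhs == start and dot == len(rhs) and origin == 0
        for lhs, rhs, dot, origin in chart[n]
    )


# Hypothetical left-recursive toy grammar: E -> E + n | n
toy_grammar: Grammar = {"E": [("E", "+", "n"), ("n",)]}
print(earley_recognize(toy_grammar, "E", ["n", "+", "n"]))  # True
print(earley_recognize(toy_grammar, "E", ["n", "+"]))       # False
```

Keeping a list alongside a set for each S(k) is one simple way to get both a processing order and duplicate rejection. Supporting empty productions would require extra care in the completion step, which this sketch deliberately leaves out.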
These steps are repeated until no more states can be added to the set. This is generally realized by keeping a queue of states to process and performing the corresponding operation depending on what kind of state it is. For the implementor, it is important to note that this is a set of states: two identical states must never be added to the same set.

EXAMPLE

The algorithm is hard to see from the abstract description above. It becomes much clearer how it operates once you see it in action. The output is a little verbose, but you should be able to follow it. Let's say you have the following simple arithmetic grammar:

P → S # the start rule
S → S + M |