Main Page | See live article | Alphabetical index

Top-down parsing

A compiler parses input from a programming language to assembly language or an internal representation by matching the incoming symbols to BNF production rules. LL parsing, also called top-bottom parsing or top-down parsing, attempts applies each production rule to the incoming symbols by working from the left-most symbol yielded on a production rule and then proceeding to the next production rule for each non-terminal symbol encountered. In this way the parsing starts on the Left of the result side (right side) of the production rule and evaluations non-terminals from the Left first and, thus, proceeds down the parse tree for each new non-terminal before continuing to the next symbol for a production rule.
e.g. would match A->aB and attempt to match B->c|cd next. Then C->df|eg would be tried. As one may expect, some languages are more ambiguous than others. For a non-ambiguous language in which all productions for a non-terminal produce distinct strings: the string produced by one production will not start with the same symbol as the string produced by the another production. A non-ambiguous language may be parsed by an LL(1) grammar where the (1) signifies the parser reads ahead one token at a time. For an ambiguous languages to be parsed by an LL parser then the parser must lookahead >1 symbol, e.g. LL(3).
The common solution is to use an RR, or bottom-up or shift-reduce, parser.