LRSTAR: LR(*) parser generator for C++
New In Version 11.0
(1) The LR(*) nondeterministic parsing has been greatly improved.
LRSTAR creates an LR(*) parser only when necessary. If your grammar is LALR(1), it creates an LALR(1) parser. If your grammar is LR(1), it creates an LR(1) parser. If your grammar is neither, because of conflicts, it creates an LR(*) parser, provided you set k to 2 or more. LRSTAR tells you whether your grammar is LALR(1), LR(1) or neither. An LR(*) parser looks ahead at parse time to resolve conflicts. Therefore, if the language you are trying to parse is LR(k), where k > 1, you should be able to parse it. In our test file, the parser needed to look ahead 13 tokens in one situation. The LR(*) parsing algorithm is efficient and has little impact on parsing speed.
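To make the idea of parse-time lookahead concrete, here is a minimal sketch in C++. It is not LRSTAR's algorithm or generated code. The toy grammar, the `Choice` enum and the `resolve_conflict` function are all assumptions made for illustration. The sketch only shows how scanning more than one token ahead can settle a conflict that a single lookahead token cannot.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy grammar (not taken from LRSTAR):
//   S -> A x ... x a
//   S -> B x ... x b
//   A -> y
//   B -> y
// After "y" has been shifted, the next token is "x" in both alternatives,
// so one token of lookahead cannot decide whether to reduce "y" to A or B.

enum class Choice { ReduceA, ReduceB, Unresolved };

// Examine at most max_k tokens starting at tokens[pos] (the first token
// after "y") and report which reduction the lookahead selects.
Choice resolve_conflict(const std::vector<std::string>& tokens,
                        std::size_t pos, std::size_t max_k) {
    assert(pos <= tokens.size());
    for (std::size_t k = 0; k < max_k && pos + k < tokens.size(); ++k) {
        const std::string& t = tokens[pos + k];
        if (t == "a") return Choice::ReduceA;  // only the A alternative ends in "a"
        if (t == "b") return Choice::ReduceB;  // only the B alternative ends in "b"
        if (t != "x") break;                   // token not valid here: stop scanning
    }
    return Choice::Unresolved;  // lookahead limit reached, or input invalid
}
```

For the input `y x x a`, calling `resolve_conflict` at position 1 with a limit of 3 tokens selects the A reduction, while a limit of 2 leaves the conflict unresolved. This mirrors the role of the k setting described above: the larger the limit, the more conflicts the lookahead can settle.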
DFA Lexical Analyzers
A DFA lexer is a finite-state automaton. DFA is the fastest recognition algorithm (5 times the speed of PDAs). DFAs work well for most programming languages. You may include the keywords of your language in the lexical grammar, which makes keyword recognition as fast as possible. The data structure created by DFA is a compressed matrix. See the comparison page to find out how DFA compares to flex and re2c.
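The following C++ sketch illustrates a table-driven DFA in which the keyword `if` is recognized by the automaton itself, with no separate keyword lookup. It is an assumption-laden illustration, not the output of the DFA generator: the names `Kind`, `char_class`, `kNext`, `kAccept` and `next_token` are hypothetical, and the transition matrix is left uncompressed for readability.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

enum class Kind { Error, Identifier, KeywordIf, Integer };

// Character classes keep the transition matrix narrow.
enum CharClass { C_I, C_F, C_LETTER, C_DIGIT, C_OTHER, NUM_CLASSES };

inline CharClass char_class(char c) {
    if (c == 'i') return C_I;
    if (c == 'f') return C_F;
    unsigned char u = static_cast<unsigned char>(c);
    if (std::isalpha(u) || c == '_') return C_LETTER;
    if (std::isdigit(u)) return C_DIGIT;
    return C_OTHER;
}

constexpr int DEAD = -1;

// States: 0 = start, 1 = seen "i", 2 = seen "if", 3 = identifier, 4 = integer.
const int kNext[5][NUM_CLASSES] = {
    //  'i'   'f'  letter digit other
    {    1,    3,    3,    4,  DEAD },  // 0: start
    {    3,    2,    3,    3,  DEAD },  // 1: "i"
    {    3,    3,    3,    3,  DEAD },  // 2: "if" (keyword unless more follows)
    {    3,    3,    3,    3,  DEAD },  // 3: identifier
    { DEAD, DEAD, DEAD,    4,  DEAD },  // 4: integer
};

// Token kind reported when the scan stops in each state.
const Kind kAccept[5] = { Kind::Error, Kind::Identifier, Kind::KeywordIf,
                          Kind::Identifier, Kind::Integer };

// Longest-match scan starting at text[pos]; pos is advanced past the token.
Kind next_token(const std::string& text, std::size_t& pos) {
    assert(pos <= text.size());
    int state = 0;
    while (pos < text.size()) {
        int next = kNext[state][char_class(text[pos])];
        if (next == DEAD) break;  // no transition: the token ends here
        state = next;
        ++pos;
    }
    return kAccept[state];
}
```

Scanning `if` ends in state 2 and yields the keyword, while `iffy` continues into state 3 and yields an identifier. The keyword costs no more than any other token, because it is only a path through the same table.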
Parsing speed is comparable to that of Yacc/Bison-generated parsers. However, LRSTAR can generate slightly faster parsers if you use the optimize option (/o). The speed advantage probably goes to LRSTAR, because it generates parsers that automatically build a symbol table and an abstract-syntax tree (AST). How does it compare to ANTLR? I tested ANTLR 4.8 using the C++ target and found that the ANTLR parser took 15 seconds, whereas the LRSTAR parser took 0.1 seconds, to process a C source code file of 227,000 lines.
The "typedef" declaration in the C and C++ languages is a context-sensitive issue. It cannot be solved by upgrading from LALR(1) to LR(1) or LR(k). It requires using a symbol table while parsing, which allows even an LALR(1) parser to handle the issue. LRSTAR has this feature built into the grammar notation, which makes life easier for the user.
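As a rough illustration of why a symbol table resolves the typedef problem, consider the statement `T * x;`. It declares a pointer if `T` names a type and is a multiplication otherwise. The C++ sketch below shows the general technique of consulting a symbol table to classify a name before the parser sees it. It does not show LRSTAR's grammar notation. The `SymbolTable` class and `TokenKind` enum are hypothetical, and scoping is ignored for brevity.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

enum class TokenKind { Identifier, TypeName };

// Minimal symbol table that only remembers which names were introduced
// by a typedef. A real one would also track scopes and other symbol kinds.
class SymbolTable {
public:
    // Called when the parser finishes a declaration such as "typedef int T;".
    void declare_typedef(const std::string& name) {
        assert(!name.empty());
        typedef_names_.insert(name);
    }

    // Called for each name in the input: the same spelling becomes a
    // different token depending on what has been declared so far.
    TokenKind classify(const std::string& name) const {
        return typedef_names_.count(name) != 0 ? TokenKind::TypeName
                                               : TokenKind::Identifier;
    }

private:
    std::unordered_set<std::string> typedef_names_;
};
```

Before `declare_typedef("T")` is called, `classify("T")` returns `TokenKind::Identifier`; afterwards it returns `TokenKind::TypeName`. The grammar can then use separate rules for the two token kinds, so one token of lookahead is enough.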
(c) Copyright Paul B Mann 2020. All rights reserved.