Incremental Scannerless Generalized LR Parsing
Master Project of Maarten Sijm


Abstract

The Scannerless Generalized LR (SGLR) parsing algorithm supports the development of composed languages seamlessly but does not support incremental parsing. The Incremental Generalized LR (IGLR) parsing algorithm, on the other hand, does not support the seamless composition of languages. This thesis presents the Incremental Scannerless Generalized LR (ISGLR) parsing algorithm and investigates the effects of combining the SGLR and IGLR parsing algorithms. While the algorithmic differences are orthogonal, the fact that scannerless parsing relies on non-deterministic parsing for disambiguation has a negative impact on incrementality. Nonetheless, we show that the ISGLR parsing algorithm performs better than the batch SGLR parsing algorithm in typical scenarios. On average, the ISGLR parser can reuse 99% of a previous parse result. When parsing from scratch, the ISGLR parser has a 24% run time overhead compared to the SGLR parser, but when parsing incrementally for changes that are smaller than 1% of the input size on average, it has a 9× speedup.

Thesis


Incremental Scannerless Generalized LR Parsing

Student: Maarten Sijm
Supervisor(s): Jasper Denkers, Eelco Visser
Defended: July 14, 2021