DOI: 10.1145/3623476.3623521
research-article

A Reference GLL Implementation

Published: 23 October 2023

Abstract

The Generalised-LL (GLL) context-free parsing algorithm was introduced at the 2009 LDTA workshop, and since then a series of variant algorithms and implementations have been described. There is a wide variety of optimisations that may be applied to GLL, some of which were already present in the originally published form.
This paper presents a reference GLL implementation shorn of all optimisations as a common baseline for the real-world comparison of performance across GLL variants. This baseline version has particular value for non-specialists, since its simple form may be straightforwardly encoded in the implementer's preferred programming language.
We also describe our approach to low-level memory management of GLL internal data structures. Our evaluation on large inputs shows a speedup of a factor of 3–4 over a naïve implementation using the standard Java APIs, and a reduction in heap requirements of a factor of 4–5. We conclude with notes on some algorithm-level optimisations that may be applied independently of the internal data representation.

Published In

SLE 2023: Proceedings of the 16th ACM SIGPLAN International Conference on Software Language Engineering
October 2023
231 pages
ISBN:9798400703966
DOI:10.1145/3623476
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. GLL implementation
  2. GLL parsers
  3. Programming language syntax specification

Conference

SLE '23