Parser combinators for Clojure(Script).
- Idiomatic and convenient API for parser combinators in Clojure and ClojureScript.
Since there is no comprehensive documentation on how to use parsesso yet,
there are other resources for getting familiar with the idea of parser
combinators in Clojure:
| Parsesso | Parsec<sup>1,2,3</sup> | Kern<sup>4</sup> | Parsatron<sup>5</sup> |
|---|---|---|---|
| `p/do-parser` | | `fwd` | `defparser` |
| `p/result` | `return` | `return` | `always` |
| `p/fail` | `fail` | `fail` | `never` |
| `p/fail-unexpected` | `unexpected` | `unexpected` | |
| `p/expecting` | `<?>`, `label` | `<?>`, `expect` | |
| `p/bind` | `>>=` | `>>=` | `bind` |
| `p/for` | `do` | `bind` | `let->>` |
| `p/after` | `>>` | `>>` | `>>`, `nxt` |
| `p/value` | `fmap` | `<$>` | |
| `p/maybe` | `try` | `<:>` | `attempt` |
| `p/look-ahead` | `lookAhead` | `look-ahead` | `lookahead` |
| `p/not-followed-by` | `notFollowedBy` | `not-followed-by` | |
| `p/*many` | `many` | `many` | `many` |
| `p/+many` | `many1` | `many1` | `many1` |
| `p/*skip` | `skipMany` | `skip-many` | |
| `p/+skip` | `skipMany1` | `skip-many1` | |
| `p/token` | `token`, `satisfy` | `satisfy` | `token` |
| `p/token-not` | | | |
| `p/word` | `tokens`, `string` | `token*` | `string` |
| `p/any-token` | `anyToken`, `anyChar` | `any-char` | `any-char` |
| `p/eof` | `eof` | `eof` | `eof` |
| `p/group` | `<*>` | `<*>` | |
| `p/alt` | `<\|>`, `choice` | `<\|>` | `choice` |
| `p/option` | `option`, `optional` | `option`, `optional` | |
| `p/between` | `between` | `between` | `between` |
| `p/times` | `count` | `times` | `times` |
| `p/*many-till` | `manyTill` | `many-till` | |
| `p/*sep-by` | `sepBy` | `sep-by` | |
| `p/+sep-by` | `sepBy1` | `sep-by1` | |
| `p/*sep-end-by` | `endBy` | `end-by` | |
| `p/+sep-end-by` | `endBy1` | `end-by1` | |
| `p/*sep-opt-by` | `sepEndBy` | `sep-end-by` | |
| `p/+sep-opt-by` | `sepEndBy1` | `sep-end-by1` | |
| `p/get-state` | `getParserState` ... | input, pos, user state | |
| `p/set-state` | `setParserState` ... | input, pos, user state | |
| `p/update-state` | `updateParserState` ... | user state | |
| `p/trace` | `parserTrace`, `parserTraced` | | |
| `expr/*chain-left` | `chainl` | `chainl` | |
| `expr/+chain-left` | `chainl1` | `chainl1` | |
| `expr/*chain-right` | `chainr` | `chainr` | |
| `expr/+chain-right` | `chainr1` | `chainr1` | |
| `char/is` | `char`, `oneOf` | `sym*`, `one-of*` | `char` |
| `char/is-not` | `noneOf` | `none-of*` | |
| `char/regex` | | | |
| `char/upper?` | `upper` | `upper` (unicode) | |
| `char/lower?` | `lower` | `lower` (unicode) | |
| `char/letter?` | `letter` | `letter` (unicode) | `letter` (unicode) |
| `char/number?` | `digit` | `digit` (unicode) | `digit` (unicode) |
| `char/letter-or-number?` | `alphaNum` | `alpha-num` (unicode) | |
| `char/white?` | `space` | `white-space` (unicode) | |
| `char/newline` | `endOfLine` | `new-line*` | |
| `char/str*` | | `<+>` | |
See some benchmarks here.
What are parser combinators, and what are they good for? How do they differ from, e.g., Instaparse, which also parses text into data?
A parser combinator library is a library with functions that can be composed into a parser. Instaparse takes a grammar specification, but in a parser combinator library you build the specification from functions, rather than a DSL.
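
To make "composed from functions" concrete, here is a minimal, library-independent sketch in plain Clojure. It is not how parsesso is implemented, and none of the names below (`token`, `alt`, `then`, `many`, `fmap`) are taken from the parsesso API; they are hypothetical helpers chosen for illustration. The sketch assumes a parser is simply a function from input to a map with the parsed value and the remaining input, or `nil` on failure.

```clojure
;; Library-independent sketch; none of these names belong to the parsesso API.
;; A parser is a function: input -> {:value v, :rest remaining-input},
;; or nil when it fails.

(defn token
  "Parser that consumes one item satisfying `pred`."
  [pred]
  (fn [input]
    (when-let [c (first input)]
      (when (pred c)
        {:value c :rest (rest input)}))))

(defn alt
  "Parser that tries `p` and falls back to `q` if `p` fails."
  [p q]
  (fn [input]
    (or (p input) (q input))))

(defn then
  "Parser that runs `p`, then `q`, and returns both values as a pair."
  [p q]
  (fn [input]
    (when-let [{a :value more :rest} (p input)]
      (when-let [{b :value more2 :rest} (q more)]
        {:value [a b] :rest more2}))))

(defn many
  "Parser that applies `p` zero or more times and collects the values."
  [p]
  (fn [input]
    (loop [acc [] in input]
      (if-let [{v :value more :rest} (p in)]
        (recur (conj acc v) more)
        {:value acc :rest in}))))

(defn fmap
  "Parser that transforms the value of `p` with `f`."
  [f p]
  (fn [input]
    (when-let [{v :value more :rest} (p input)]
      {:value (f v) :rest more})))

;; A grammar is then ordinary code: identifier = letter (letter | digit)*
(def letter (token (set "abcdefghijklmnopqrstuvwxyz")))
(def digit  (token (set "0123456789")))

(def identifier
  (fmap (fn [[c cs]] (apply str c cs))
        (then letter (many (alt letter digit)))))

(:value (identifier "x42 = 1")) ;; => "x42"
(identifier "42")               ;; => nil, an identifier must start with a letter
```

Because `identifier` is an ordinary value built by ordinary functions, it can be passed around, wrapped in further helpers, and reused like any other Clojure code. A real library adds what this sketch leaves out, such as error messages, position tracking, and control over backtracking.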
When should I pick parser combinators over EBNF? Do they offer the same capabilities, so that it is only a question of which one I prefer to learn, or is there some distinct advantage over a DSL such as EBNF? Perhaps it is easier to describe more complex grammars because I can write my own helper functions?
In general, parser combinators such as parsesso are for creating top-down
(i.e. LL) parsers, with the ability to reuse common code (this library). Parser
generators typically generate a finite-state automaton for a bottom-up (LR)
parser, though nowadays there are also combinators for LR grammars and
generators for LL ones (e.g. ANTLR). Which one you should use depends on how
hard your grammar is and how fast the parser needs to be. In particular, if the
grammar has a lot of non-trivial ambiguities, the more flexible combinator
approach might be easier.
- Michiel Borkent
  - Compatibility with babashka.
  - GitHub CI configuration.
  - Clj-kondo configuration tips.
- Jakub Holý
  - Questions and answers in FAQ.