[Ur] Combinator Parsing in Ur: Anonymous function remains at code generation

orchidaceae phalaenopsis orchid.hybrid at gmail.com
Tue Jul 22 15:09:10 EDT 2014


Dear Adam,

Thank you very much for fixing the bug to allow this test-case to
build! It's really exciting to Ur/Web grow and improve.

Sorry to say but I could not continue to build my BBCode parser, I
added a new combinator:

fun choice [a] (p : parser a) (q : parser a) : parser a =
    fn input => case p input of
                    Empty Failure => q input
                  | Empty ok => (case q input of
                                     Empty _ => Empty ok
                                   | consumed => consumed)
                  | consumed => consumed

and

val bbid : parser bbtag =
    choice
        (_ <- char #"b" ; return B)
        (_ <- char #"u" ; return U)

triggers the anonymous function problem.

Also to simulate lazy evaluation (which is essential in this parser),
I will need to change

    datatype consumed a = Consumed of reply a
                        | Empty of reply a

to delay the replies

    datatype consumed a = Consumed of unit -> reply a
                        | Empty of unit -> reply a

later on. These are higher order objects that I don't think could be
removed completely during compiling?

This is a Parsec style parser combinator library, I also tried ReadP
style and while the combinators are accepted by themselves I think it
is even higher order than Parsec so it wouldn't work in practice - but
a combinator library is a really nice way to do parsing once it was
working nicely it could be part of the standard library maybe?

In the meantime, since it seems like it could take some work on the
compiler, I've thought about other possibilities for parsing in urweb.
Now I have seen other people do things like this
https://github.com/thoughtpolice/ur-sundown (see in particular the
NOTA BENE section) but not in a type safe way.

A) Somehow retarget ml-lex/ml-yacc to generate valid Ur (too horrible
to think about it further)
B) Retarget a parser expression grammar compiler like peg-bootstrap or
SMLPEG to Ur: This means one would have to make it generate pure (both
implementations use mutable cells), first order code without monads -
so you really have your hands tied but it should be possible.
C) Create the most core of the parsing monad in the FFI, and implement
the rest of the parser combinators in terms of that: At first this
seemed impossible because I cannot define polymorphic functions in the
FFI, but it might be possible to do it using the transaction monad.
This way isn't very promising because it might still run into the
anonymous functions problem.

So I think (B) is the best option for now and I will try to explore it
a little more.

[1]: https://github.com/kragen/peg-bootstrap/blob/master/peg.md
[2]: https://github.com/standardml/SMLPEG



More information about the Ur mailing list