[Ur] Regular expressions DSEL?

Ziv Scully zivscully at gmail.com
Mon Feb 20 04:48:20 EST 2017


By parametrizing over a type-level record, you can encode the names of
matched subgroups in the type of the regular expression. The interface
probably looks something like:

(* Abstract type of uncompiled regular expressions. *)
con t :: {Unit} (* names of captured subgroups *) -> Type

val literal :
    string -> t []
val any : t []
val concat : r1 ::: {Unit} -> r2 ::: {Unit} -> [r1 ~ r2] =>
    t r1 -> t r2 -> t (r1 ++ r2)
val star : r ::: {Unit} ->
    t r -> t r
val alt : r ::: {Unit} -> s1 ::: {Unit} -> s2 ::: {Unit} -> [r ~ s1] => [r
~ s2] => [s1 ~ s2] =>
    t (r ++ s1) -> t (r ++ s2) -> t (r ++ s1 ++ s2)
val capture : r ::: {Unit} ->
    nm :: Name -> [r ~ [nm]] => t r -> t (r ++ [nm])

(* Abstract type of compiled regular expressions *)
con compiled :: {Unit} -> Type

val compile : r ::: {Unit} ->
    t r -> compiled r

Obviously this probably won't exactly work. For instance, if the underlying
library requires explicit names for captured groups, you'll need something
more like

val capture : r ::: {Unit} -> nm :: Name -> [r ~ [nm]] =>
    {nm : string} -> t r -> t (r ++ [nm])

because there's no way to extract a string from a type-level Name. In fact,
this way has another benefit: the internal representation of the type can
just be a string! All the tracking of subgroups can happen at the type
level. You may not even need compile.

This interface assumes each subgroup may appear 0 or 1 times. You could
maybe keep track of the number of matches more precisely by using {Type}
instead of {Unit} and using a type class to implement type-level functions.

con t : {Type} (* maps names of captured subgroups to type-level
representation of number of matched fields *) -> Type

type one
type oneOrNone

class meet : Type -> Type -> Type
val meet_one_oneOrNone : weaken one oneOrNone oneOrNone
val meet_oneOrNone_one : weaken oneOrNone one oneOrNone
val meet_oneOrNone_oneOrNone : weaken oneOrNone oneOrNone oneOrNone

This makes the types more complicated, but I suspect (though I don't know
for sure) that Ur/Web will be able to infer the extra stuff in any concrete
(that is, monomorphic) use, so the user code will still look pretty. Here's
how some of the functions look with this. The place weaken gets used is alt.

val star : r ::: {Type} ->
    t r -> t (map (fn _ => oneOrNo) r)

val alt : r ::: {Type * Type * Type} -> s1 ::: {Type} -> s2 ::: {Type} ->
    [map fst3 r ~ s1] => [map snd3 r ~ s2] => [s1 ~ s2] =>
    $(map (fn (t1, t2, t3) => weaken t1 t2 t3) r)
    t (map fst3 r ++ s1) -> t (map snd3 r ++ s2)
    -> t (map thd3 r ++ map (fn _ => oneOrNo) (s1 ++ s2))

You could extend the same sort of idea to types like oneOrMore and
noneOrMore—useful if the underlying library can return a list of matches
inside a + or *—by adding more cases to weaken.

Here's a Haskell library that takes basically this approach:
https://github.com/sweirich/dth/tree/master/popl17/src

Hope this helps!
Ziv


On Mon, Feb 20, 2017 at 12:00 AM, Artyom Shalkhakov <
artyom.shalkhakov at gmail.com> wrote:

> 2017-02-20 10:47 GMT+06:00 <fold at tuta.io>:
>
>> Maybe just define another attribute in basis.urs?
>>
>> It would be much interesting to define a DSEL though. I have not seen any
>> attempts to create a DSEL for regular expressions.
>>
>>
> Indeed! And it seems like it would be a nice exercise as well having some
> practical uses to it.
>
> I envision something along the lines of:
>
> datatype RegExp = ... (* constructors *)
>
> type t (* abstract, not exported outside the module *)
> val compile : RegExp -> t (* the trusted function *)
> val matches : string -> t -> bool (* check if passed-in string matches the
> expression *)
>
> And the implementation might use something unsafe (like a string, or an
> actual regexp object in JS). Then we could offload the interpretation of
> regexps to an off-the-shelf library. The [compile] function would print and
> escape the AST to a regexp string, and the Ur/Web compiler might actually
> optimize all of this away.
>
> If we are to take care of matching subgroups, that would require a more
> complicated type. I guess that should be doable as well, but I don't see
> immediately how to handle this.
>
>
>>
>>
>> 20. Feb 2017 16:28 by artyom.shalkhakov at gmail.com:
>>
>>
>> Hello all,
>>
>> 'm doing form validation stuff, following examples on MDN. [1]
>>
>> It seems like Ur/Web doesn't handle regular expressions?
>>
>> I guess it should not be too diffcult to define a DSEL in Ur/Web (just
>> AST constructors and a few trusted functions: e.g. [compile : AST ->
>> regexp], where [regexp] is an abstract type, or maybe just a string :-))
>> and have it work at least in the browser.
>>
>> Could somebody point me in the right direction here?
>>
>> --
>> Cheers,
>> Artyom Shalkhakov
>>
>> [1] https://developer.mozilla.org/en-US/docs/Web/Guide/HTML/
>> HTML5/Constraint_validation
>>
>>
>
>
> --
> Cheers,
> Artyom Shalkhakov
>
> _______________________________________________
> Ur mailing list
> Ur at impredicative.com
> http://www.impredicative.com/cgi-bin/mailman/listinfo/ur
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.impredicative.com/pipermail/ur/attachments/20170220/6a524d7e/attachment.html>


More information about the Ur mailing list