[Ur] C type for Ur/Web list type

Artyom Shalkhakov artyom.shalkhakov at gmail.com
Thu Apr 14 07:35:21 EDT 2016


2016-04-14 17:24 GMT+06:00 Adam Chlipala <adamc at csail.mit.edu>:
> On 04/14/2016 07:14 AM, Artyom Shalkhakov wrote:
>>
>> 2016-04-14 17:02 GMT+06:00 Adam Chlipala <adamc at csail.mit.edu>:
>>>
>>> On 04/14/2016 06:53 AM, Artyom Shalkhakov wrote:
>>>>
>>>> there is this question: what C types do [list string] and [option string]
>>>> map to? I think that [option string] probably maps to a nullable pointer to
>>>> uw_Basis_string. What about the list constructor?
>>>
>>>
>>> That's a tricky one.  There is actually no support for parametric
>>> polymorphism in the FFI.  Every [list a] type is compiled to a separate C
>>> struct.
>>>
>> Okay. The use-case is this: I'm decoding/encoding URI parameters in an
>> Ur/Web program, naively. That involves a lot of string concatenation.
>> Is there another (preferably external-dependency-free) approach to
>> this?
>
>
> Have you tried using normal Ur/Web code for that operation?  I think there's
> a nontrivial chance that you'll get better performance that way than with C
> FFI code, because the compiler has special optimizations for string
> concatenation.  They kick in when that concatenation flows directly into the
> value that is being returned as the HTTP body, whether it's HTML or a string
> in a blob.
>

The code is as follows:

> fun
> uri_encode (s:string):string = let
>   fun
>   tohexstr i = let
>     val low = i % 16
>     val high = i / 16
>     fun hexdigit i =
>         case i of
>           0 => "0" | 1 => "1" | 2 => "2" | 3 => "3" | 4 => "4" | 5 => "5" | 6 => "6" | 7 => "7" | 8 => "8" | 9 => "9"
>         | 10 => "A" | 11 => "B" | 12 => "C" | 13 => "D" | 14 => "E" | 15 => "F"
>         | _ => error <xml>tohexstr: invalid digit {[i]}</xml>
>   in
>     hexdigit high ^ hexdigit low
>   end
>
>   fun
>   aux i n s acc =
>   if i < n then let
>       val c = strsub s i
>     in
>       (* NOTE: strcat seems to be QUITE inefficient here *)
>       if isalnum c || Option.isSome (String.index ";,/?:@&=+$#" c) then
>         aux (i+1) n s (strcat acc (str1 c))
>       else
>         aux (i+1) n s (strcat acc (strcat "%" (tohexstr (ord c))))
>     end
>   else acc
>   val res = aux 0 (strlen s) s ""
> in
>   res
> end
>
> fun
> uri_decode (s: string): string = let
>   fun
>   aux i n s acc =
>   if i < n then let
>       val c = strsub s i
>     in
>       (* NOTE: strcat seems to be QUITE inefficient here *)
>       if c = #"%" then (
>         if i+2 >= n then error <xml>decode: premature EOS</xml>
>         else let
>             val c1 = strsub s (i+1)
>             val c2 = strsub s (i+2)
>             (* Convert one hex digit to its value; reject anything else *)
>             fun hexval c =
>                 if c >= #"a" && c <= #"f" then ord c - ord #"a" + 10
>                 else if c >= #"A" && c <= #"F" then ord c - ord #"A" + 10
>                 else if c >= #"0" && c <= #"9" then ord c - ord #"0"
>                 else error <xml>decode: invalid hex digit</xml>
>             val digit1 = hexval c1
>             val digit2 = hexval c2
>             val c0 = chr (digit1 * 16 + digit2)
>           in
>             aux (i+3) n s (strcat acc (str1 c0))
>           end
>         )
>       else if c = #"+" then aux (i+1) n s (strcat acc " ")
>       else aux (i+1) n s (strcat acc (str1 c))
>     end
>   else acc
> in
>   aux 0 (strlen s) s ""
> end

It seems to work fine in my tests.

I'm just worried that concatenating strings this way is inefficient
(I haven't actually looked at the generated code yet). I will take
a look at the generated C code and get back here with the results.
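
For comparison, here is roughly what a C version could look like if it
wrote into one preallocated buffer instead of concatenating. If each
strcat copies its accumulator, the loops above do quadratic work, while
this does a single pass. This is only a sketch under assumptions: it uses
plain malloc rather than the Ur/Web runtime's allocator, and the names
uri_encode_buf, uri_decode_buf, and hex_value are hypothetical, not part
of any Ur/Web API. It keeps the same character set as the code above.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: value of one hex digit (0..15), or -1 if invalid. */
static int hex_value(unsigned char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical: percent-encode s into a malloc'ed buffer (caller frees).
   Returns NULL if allocation fails. */
char *uri_encode_buf(const char *s) {
    static const char hex[] = "0123456789ABCDEF";
    static const char keep[] = ";,/?:@&=+$#";
    size_t n = strlen(s);
    /* Worst case: every byte becomes "%XX", plus the terminator. */
    char *out = malloc(3 * n + 1);
    char *p = out;
    if (out == NULL) return NULL;
    for (size_t i = 0; i < n; i++) {
        unsigned char c = (unsigned char)s[i];
        if (isalnum(c) || strchr(keep, c) != NULL) {
            *p++ = (char)c;
        } else {
            *p++ = '%';
            *p++ = hex[c >> 4];
            *p++ = hex[c & 15];
        }
    }
    *p = '\0';
    return out;
}

/* Hypothetical: decode "%XX" and '+' into a malloc'ed buffer (caller frees).
   Returns NULL on allocation failure, a truncated escape, or a bad hex digit.
   Note that "%00" yields an embedded NUL, which ends the C string early. */
char *uri_decode_buf(const char *s) {
    size_t n = strlen(s);
    /* Decoding never grows the input, so n + 1 bytes are enough. */
    char *out = malloc(n + 1);
    char *p = out;
    if (out == NULL) return NULL;
    for (size_t i = 0; i < n; ) {
        if (s[i] == '%') {
            int hi, lo;
            if (n - i < 3) { free(out); return NULL; }
            hi = hex_value((unsigned char)s[i + 1]);
            lo = hex_value((unsigned char)s[i + 2]);
            if (hi < 0 || lo < 0) { free(out); return NULL; }
            *p++ = (char)(hi * 16 + lo);
            i += 3;
        } else {
            *p++ = (s[i] == '+') ? ' ' : s[i];
            i++;
        }
    }
    *p = '\0';
    return out;
}
```

A real FFI binding would presumably take the Ur/Web context and allocate
through the runtime rather than malloc; I have not sketched that part.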

> It's probably possible to get those optimizations into play in other
> situations, too, with a moderate amount of compiler work.



-- 
Cheers,
Artyom Shalkhakov


