[Ur] TechEmpower Benchmarks

Adam Chlipala adamc at csail.mit.edu
Wed Dec 11 19:16:29 EST 2013


On 12/11/2013 04:06 PM, escalier at riseup.net wrote:
> I suppose I should have said "obvious changes that may turn out to be
> improvements".
>
>    
>> I'm curious how involving Nginx could improve performance
>>      
> In my rather brief and inexpert tests, Ur/Web + lighttpd had about 70% of
> pure Ur/Web performance.
>    

Interesting; so throwing at least one popular proxy in front doesn't 
bring magic performance improvements.  That's comforting, at least from 
the perspective of not challenging my mental model of how efficient the 
Ur/Web HTTP binaries are.

I should check: is 70% performance better or worse than the baseline? :)

>> I've pushed a changeset to avoid all database operations for page
>> handlers that don't need the database.
>>      
> Wow. Dramatic improvement here too.
>
> If I make a dubious extrapolation, (new local)/(old local) = (hypothetical
> new i7)/(old i7), the hypothetical i7 performance I get on JSON
> serialization, for example, is ~47,000. That would be around 75% of Yesod
> performance (which I have supposed to be the framework most analogous to
> Ur/Web).
>    

Yesod is probably most similar in terms of the programming experience, 
but at run-time, I would expect the closest frameworks to be those based 
directly on C or C++!

You can take a look at the generated code for, e.g., '/json' URIs by 
running 'urweb' with the '-debug' flag.  The C source will then be in 
/tmp/webapp.c.  (I just pushed a changeset that adds an optimization 
that makes this code even more direct, though it doesn't seem to have 
any serious performance effect for any of the benchmarks.)

The '/json' handler is a function named something like 
'__uwn_wrap_json_XXX'.  It does almost nothing (a rough sketch of the 
sequence follows this list):
- Send some hints about region-based memory management with 
uw_[begin|end]_region().  These should be little more than single 
pointer bumps.
- Call the function to add the required HTTP headers.  Probably trivial 
running time here, though we most likely take on system call overhead to 
get the current time.  Writing the headers themselves is just copying 
into a mutable string buffer.
- Clear the mutable string buffer holding the page to return.
- Make a number of uw_write() calls to append content to that buffer.
- Call the string-escaping function twice; it writes its output 
directly into the page buffer.
- Call the runtime system function to return the current page buffer 
content with a particular MIME type.
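
For intuition, here is a minimal, self-contained C sketch of what that 
sequence amounts to, ignoring the header and region bookkeeping: literal 
fragments and an escaped string get appended to a mutable buffer, and 
that buffer is what gets returned.  The buffer type and helper names 
(buf, buf_append, escape_json_into) are my own illustration, not the 
Ur/Web runtime API or the actual generated code:

/* Sketch only: mimics the copying the generated handler does. */
#include <stdio.h>
#include <string.h>

#define PAGE_MAX 4096

typedef struct { char data[PAGE_MAX]; size_t len; } buf;

static void buf_clear(buf *b) { b->len = 0; b->data[0] = '\0'; }

/* Append a literal fragment to the page buffer (like uw_write()). */
static void buf_append(buf *b, const char *s) {
  size_t n = strlen(s);
  if (b->len + n < PAGE_MAX) { memcpy(b->data + b->len, s, n + 1); b->len += n; }
}

/* Escape a string directly into the page buffer, as the real escaping
   function does; only quotes and backslashes are handled here. */
static void escape_json_into(buf *b, const char *s) {
  for (; *s && b->len + 2 < PAGE_MAX; ++s) {
    if (*s == '"' || *s == '\\') b->data[b->len++] = '\\';
    b->data[b->len++] = *s;
  }
  b->data[b->len] = '\0';
}

int main(void) {
  buf page;
  buf_clear(&page);                        /* clear the page buffer */
  buf_append(&page, "{\"message\":\"");    /* literal JSON fragments */
  escape_json_into(&page, "Hello, World!");
  buf_append(&page, "\"}");
  /* The runtime would now return page.data with MIME type
     application/json; here we just print it. */
  printf("%s\n", page.data);
  return 0;
}

The point is just that the handler's work is a handful of memcpy-style 
appends into one buffer, which is hard to beat.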

So, I hope you agree that it's not obvious how any of this could be made 
much faster, even working directly with a reasonable C library for HTTP 
serving.  In contrast, understanding the run-time behavior of any 
Haskell program is probably much more complex.

Again, I'd be very interested in any help anyone is willing to offer on 
comparing the execution of the latest Ur/Web benchmark against one of 
the current winners for the 'plaintext' benchmark.  There may be 
runtime-system inefficiencies that such a comparison could tease out.  So 
far my personal motivation level hasn't reached the point where I'm 
willing to replicate the official benchmark setup on EC2, but I'd be 
very happy to provide help (and maybe even money) to someone else who 
would take charge of it!

P.S.: I also just learned that OpenSSL's random number generation is not 
thread-safe, which probably led to some segfaults during execution of 
the benchmarks that call Ur/Web's [rand].  This issue has now been fixed 
by adding a lock in the Ur/Web library.
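
For concreteness, here is a minimal sketch of that kind of fix, assuming 
a pthread mutex around OpenSSL's RAND_bytes(); the wrapper name is 
hypothetical, and this illustrates the approach rather than the actual 
patch:

#include <pthread.h>
#include <openssl/rand.h>

static pthread_mutex_t rand_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Serialize all calls into OpenSSL's RNG behind one process-wide lock. */
int locked_rand_bytes(unsigned char *out, int len) {
  int r;
  pthread_mutex_lock(&rand_mutex);
  r = RAND_bytes(out, len);   /* not thread-safe in OpenSSL, hence the lock */
  pthread_mutex_unlock(&rand_mutex);
  return r;
}

(Link with -lcrypto -lpthread.)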


