[Ur] Drop of several orders of magnitude in Techempower benchmarks

Oisín Mac Fhearaí denpashogai at gmail.com
Mon Aug 5 17:17:45 EDT 2019


Update! The good news:
I was able to update the Dockerfile to build Ur/web from the latest release
tarball (basically, using the old round 16 Dockerfile with a couple of
small fixes like installing libicu-dev) and compare the benchmarks with the
version installed with apt from the Ubuntu repo. The version built from the
latest release was over ten times faster, even running on my old laptop.

The bad news:
The latest version of Ur appears to fail the "fortunes" test with the
following diff (there is more, but this seems to explain it):

fortune: -<tr><td>6</td><td>Emacs is a nice operating system, but I prefer
UNIX. — Tom Christaensen</td></tr>
fortune: +<tr><td>6</td><td>Emacs is a nice operating system, but I prefer
UNIX.  Tom Christaensen</td></tr>
fortune: @@ -17 +17 @@

fortune: -<tr><td>12</td><td>フレームワークのベンチマーク</td></tr>
fortune: +<tr><td>12</td><td></td></tr>

It would seem that non-ASCII characters are being stripped from the output,
causing the test to fail. I'm not familiar with exactly what the test is
trying to do, and I don't know much about how Ur/web handles UTF-8.
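For what it's worth, the diff above is consistent with the output being
re-encoded to ASCII somewhere, with unmappable characters silently dropped.
A quick sanity check of that hypothesis (plain Python, not Ur/web code):

```python
# Hypothesis check, not Ur/web code: if the response body is re-encoded
# to ASCII with unmappable characters silently dropped, the two failing
# fortunes rows degrade exactly as in the diff above.
rows = [
    "<tr><td>6</td><td>Emacs is a nice operating system, "
    "but I prefer UNIX. — Tom Christaensen</td></tr>",
    "<tr><td>12</td><td>フレームワークのベンチマーク</td></tr>",
]
for row in rows:
    # errors="ignore" drops the em dash and the Japanese text entirely
    print(row.encode("ascii", errors="ignore").decode("ascii"))
```

which prints exactly the "+" lines from the diff: stripping the em dash
leaves the double space before "Tom Christaensen", and the Japanese fortune
becomes an empty table cell.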

If you can advise on how to fix this, I'd be happy to open a PR on the
Techempower benchmarks repo with my changes.

Oisín

On Mon, 5 Aug 2019 at 19:43, Oisín Mac Fhearaí <denpashogai at gmail.com>
wrote:

> Although I'm no closer to understanding why performance seems to have
> dropped in the benchmarks, thanks to a couple of comments on the Github
> issue I was able to find some more detailed logs of the test runs.
>
> Fortunes, Round 16:
> https://tfb-logs.techempower.com/round-16/final/citrine-results/urweb/fortune/raw.txt
> Fortunes, Round 17:
> https://tfb-logs.techempower.com/round-17/final/citrine-results/20180903024112/urweb/fortune/raw.txt
>
> I'm amazed by the difference in request latencies:
>
> Round 16:
>   Thread Stats   Avg      Stdev     Max   +/- Stdev
>     Latency   235.27us  140.45us   1.80ms   90.30%
>     Req/Sec     4.36k   148.24     4.89k    72.06%
>
> Round 17:
>   Thread Stats   Avg      Stdev     Max   +/- Stdev
>     Latency    16.29ms   39.60ms 327.45ms   95.34%
>     Req/Sec   123.38     20.22   141.00     95.00%
>
> In both cases, the web service is being hit by the "wrk" load tester, with
> the exact same parameters.
>
> The only difference I can think of, then, is that the round 17 Ur/web
> Dockerfile installs urweb via the apt package manager, whereas the round 16
> Dockerfile directly downloads an old tarball from 2016. But I've tested the
> Ubuntu-packaged version on my laptop and it performs almost exactly the
> same as the latest version from Git. So why does the round 17 benchmark
> have a max latency of 327 ms, compared to under 2 ms in the previous round?
>
> So confuse.
>
> On Fri, 2 Aug 2019 at 21:45, Oisín Mac Fhearaí <denpashogai at gmail.com>
> wrote:
>
>> I tried cloning the latest version of the benchmarks to run the Urweb
>> tests locally, but sadly the Docker image fails to build for me (due to a
>> problem with the Postgres installation steps, it seems). I've opened an
>> issue here:
>> https://github.com/TechEmpower/FrameworkBenchmarks/issues/4969 ... I
>> also asked for advice on how to track down the massive performance drop in
>> the Urweb tests; hopefully they'll have some thoughts on it. Sadly, I'm
>> running things on a 9-year-old laptop, so it's hard to draw firm
>> conclusions about performance...
>>
>> On Thu, 1 Aug 2019 at 13:23, Adam Chlipala <adamc at csail.mit.edu> wrote:
>>
>>> I'm glad you brought this up, Oisín.  I was already thinking of
>>> appealing to this mailing list, in hopes of finding an eager detective to
>>> hunt down what is going on!  I can say that I can achieve much better
>>> performance with the latest code on my own workstation (similar profile to
>>> *one* of the several machines used by TechEmpower), which leads me to
>>> think something basic is getting in the way of proper performance in the
>>> benchmarking environment.
>>> On 7/31/19 8:06 PM, Oisín Mac Fhearaí wrote:
>>>
>>> I've noticed that Ur/web's performance benchmarks on Techempower have
>>> changed significantly between round 16 and 17.
>>>
>>> For example, in round 16, Urweb achieved 323,430 requests per second on
>>> the "Fortunes" benchmark.
>>> In round 17 (and beyond), it achieved 4,024 RPS with MySQL and 2,544 RPS
>>> with Postgres.
>>>
>>> What could explain such a drastic drop in performance? The blog entry
>>> for round 17 mentioned query pipelining as an explanation for some of the
>>> frameworks getting much faster, but I don't see why Urweb's RPS would
>>> drop by a factor of roughly 100, unless perhaps previous rounds had query
>>> caching enabled and round 17 disabled it.
>>>
>>> Can anyone here shed light on this? I built a simplified version of the
>>> "sql" demo with the 2016 tarball version of Ur (used by the round 16
>>> benchmarks) and a recent snapshot, and they both perform at similar speeds
>>> on my laptop.
>>>
>>> Oddly, the load testing tool I used (a Go program called "hey") seems to
>>> have one request that takes 5 seconds if I set it to use more concurrent
>>> threads than the number of threads available to the Ur/web program.
>>> Otherwise, the longest request takes about 0.02 seconds. This seems
>>> unrelated to the performance drop on the Techempower benchmarks, since the
>>> max latency is quite low there.
>>>
>>> _______________________________________________
>>> Ur mailing list
>>> Ur at impredicative.com
>>> http://www.impredicative.com/cgi-bin/mailman/listinfo/ur
>>>
>>