[squid-dev] Sad performance trend
Alex Rousskov
rousskov at measurement-factory.com
Mon Sep 12 17:36:31 UTC 2016
On 09/12/2016 09:38 AM, Amos Jeffries wrote:
> On 7/09/2016 5:43 a.m., Alex Rousskov wrote:
>> On 09/06/2016 08:27 AM, Amos Jeffries wrote:
>>> On 27/08/2016 12:32 p.m., Alex Rousskov wrote:
>>>>        W1   W2   W3   W4   W5   W6
>>>> v3.1   32%  38%  16%  48%  16+   9%
>>>> v3.3   23%  31%  14%  42%  15%   8%
>>>> v3.5   11%  16%  12%  36%   7%   6%
>>>> v4.0   11%  15%   9%  30%  14%   5%
> Since the test was a bit unreliable, I ran it freshly against each
> branch that would build when I wanted to check progress.
> The last test run can be found in parserng-polygraph if you want to dig
> into the logs for other measures.
Sorry, I do not know how to get to "parserng-polygraph". All Jenkins
results published at http://build.squid-cache.org/ appear to be too old,
but perhaps I am looking in the wrong place. It is probably not
important for this specific discussion though.
> branch : Mean RPS
> --------------------
> squid-3.2 : 1933.98
> squid-3.3 : 1932.81
> squid-3.4 : 1931.12
> squid-3.5 : 1926.13
Looks like performance may be gradually getting worse in this macro test
as well, but the decrease is not as pronounced as in the micro tests
(which is expected, of course).
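As a quick illustration, here is a minimal Python sketch of mine with
your quoted means hard-coded (the branch names and values come straight
from the table above; nothing else is measured): every branch-to-branch
step is downward, and the cumulative drop is ~7.85.

    # Minimal sketch: per-step deltas from the quoted mean RPS table.
    # Values are copied from the table above, not re-measured.
    means = [
        ("squid-3.2", 1933.98),
        ("squid-3.3", 1932.81),
        ("squid-3.4", 1931.12),
        ("squid-3.5", 1926.13),
    ]
    values = [v for _, v in means]
    deltas = [round(b - a, 2) for a, b in zip(values, values[1:])]
    print(deltas)                            # [-1.17, -1.69, -4.99]
    print(round(values[-1] - values[0], 2))  # -7.85: cumulative drop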
> The fluctuation / error bars seemed to be about 1 RPS for that polygraph
> workload.
Please correct me if I am misinterpreting what you are saying, but to me
it sounds like "The results are not getting gradually worse because each
result has a ~1 RPS precision". That conclusion does not compute for me,
for two independent reasons:
1. There is a significant difference between fluctuation of individual
results and a trend. Fluctuation of individual test results may coexist
with a real, meaningful downward trend (see the simulation sketch after
this list). Whether that happens here, I do not know, but your comment
appears to imply that you are dismissing the visible trend for the
wrong (inapplicable) reason.
2. Even if a 1-unit difference is insignificant, the results show that
performance got worse by 7+ units (~1934 vs ~1926).
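To put a rough number on reason 1, here is a back-of-the-envelope
simulation sketch of mine. It assumes the fluctuation is pure,
independent ~1 RPS noise around a flat true mean, which is my
simplification, not a claim about how Polygraph actually behaves. Under
that assumption, four strictly decreasing results would occur by chance
only about 1/4! = ~4% of the time, and noise of that size could never
span the observed ~7.85 total drop at all:

    # Simulation sketch (assumed noise model, not a real measurement):
    # how often does pure +/-1 RPS noise around a flat mean produce
    # four strictly decreasing results?
    import random

    def strictly_decreasing(xs):
        return all(b < a for a, b in zip(xs, xs[1:]))

    trials = 100_000
    hits = sum(
        strictly_decreasing([random.uniform(-1.0, 1.0) for _ in range(4)])
        for _ in range(trials)
    )
    print(hits / trials)  # ~0.042 (= 1/4!)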
BTW, I would caution against thinking of these numbers as RPS. That test
is not designed to predict sustained request rates. A 1-unit difference
in these results may correspond to a ~0% or, say, ~10% difference in
actual sustained request rates.
Alex.