[squid-dev] [PATCH] Refactor wordlist to SBufList in acl/RegexData

Kinkie gkinkie at gmail.com
Thu Oct 27 07:02:25 UTC 2016


On Thu, Oct 27, 2016 at 5:10 AM, Alex Rousskov
<rousskov at measurement-factory.com> wrote:
> On 10/26/2016 05:18 PM, Kinkie wrote:
>>>> the attached patch refactors the use of wordlist to SBufList in
>>>> acl/RegexData.cc
>
>> -    while (wl != NULL) {
>> +    for (SBuf i : sl) {
>
> If possible, please avoid creating new SBufs by declaring "i" to be a
> constant reference to SBuf. It is probably possible unless you modify
> "i" -- I cannot tell for sure by looking at all the "i"s in the patch.

SBuf::c_str() is unfortunately non-const which limits applicability.

> Renaming "i" to something longer and more informative like
> "patternOrOption" or even "slowPatternOrOption" would be useful for
> those who try to understand what is going on and/or have to find that
> variable in a text full of other "i"s.

Done, renamed to configurationLineWord.


>> +    for (auto i : sl) {
>
> Same here, on all counts. Auto does not automatically adds "&" or "const".

I know, but this can't be changed due to c_str() below.

> What is the time difference in configuring, for example, one million of
> regexes using patched and unpatched code?

The results were so surprising I had to double-check correctness which
is confirmed.
For a 100000-lines regex acl, "squid -k parse" time went DOWN from 36
seconds to 0.7 on my Intel Core i5 Macbook air.
By checking SBuf stats in cachemgr, the calling code is actually quite
bad for SBuf allocations; SBufs are apparently mostly used to hold
temporary c-strings and are immediately freed. I suspect however
calling code some O(n^2) or worse behavior in wordlist, hence the big
win.

> I have not reviewed the patch closely. I have no objections.

Thanks for the insights you gave.

Merging the updated code.

-- 
    Francesco


More information about the squid-dev mailing list