[squid-dev] [RFC] Support concurrent SBuf::c_str() calls

Kinkie gkinkie at gmail.com
Sun Oct 2 21:25:24 UTC 2016


On Fri, Sep 30, 2016 at 6:03 PM, Alex Rousskov
<rousskov at measurement-factory.com> wrote:
> On 09/29/2016 09:19 PM, Amos Jeffries wrote:
>> On 30/09/2016 5:03 a.m., Alex Rousskov wrote:
>>> Should we remove the increment to make concurrent c_str() calls safe?
>
>> The reason it exists remember is to prevent other SBuf sharing that
>> storage MemBuf from thinking they can append to the storage oer top of
>> the terminator.
>
> Yes, I know, but that increment not only prevents problems but _causes_
> them as well: The increment may force the SBuf to realloc, which may
> result in deallocation of the original storage still pointed to by the
> earlier c_str() result.
>
>
>> Given that we have no easy knowledge of how the c_str()
>> is going to be used by the recipient code we cannot guarantee lack of
>> terminator is safe.
>
> Yes, with or without the increment, Squid may crash or misbehave when
> buggy code appends to SBuf while a previous c_str() result is still in
> use. The lack of increment is not safe. The presence of the increment is
> not safe. The problems are different, but they exist with or without the
> increment.
>
>
>> Time to push SBuf usage forward and aggressively remove the need for
>> c_str()?
>
> The need for c_str() will never go away because libraries outside of
> Squid control need c-strings. And, due to the lack of proper QA, any
> "aggressive push" affecting lots of Squid code is a recipe for a
> disaster IMO.
>
> The question is which c_str() implementation is safer? The one that may
> crash Squid when called twice for the same SBuf in the same expression
> or the one that may crash Squid when appending to the same SBuf in the
> same expression? There are already two known cases of the former. There
> are currently no known cases of the latter.
>
>
> Overall, I know of three primary ways to implement c_str():
>
> 1. Always 0-terminate SBuf-used storage. Safest implementation possible,
> but has serious negative performance overheads, especially for tokenizing.

If this is the decision, might as well drop sbuf altogether in favor
of std::string (or possibly std::string + custom mempools-friendly
Allocator).

> 2. Terminate the used storage at the time of c_str().
>
>     2a. Increment the used storage size. Current implementation.
>         Leads to problems discussed in this thread.
>
>     2b. Do not increment the used storage size.
>         Leads to other problems discussed in this thread.
>
> 3. Allocate and store a new c-string as needed, copying from the
> storage. Safest implementation possible but has serious negative
> performance overheads. These overheads affect different use cases than
> those of #1.

If we are aware of the underlying issue, there's no reason not to save
the temporary to a naked char* and use that, as long as the SBuf is
not touched it'll work fine.

> Needless to say, we can add clever code to reduce risks associated with
> some implementations. For example (these are just rough sketches/ideas):
>
> a) Appending code in #2b can check whether it is about to overwrite the
> 0-terminator added by a previous c_str() call and, if yes, preserve the
> old storage (we can preserve one such storage or maintain a global FIFO
> queue).
>
> b) C-string allocating code in #4 can maintain a global FIFO queue of
> previously allocated but now supposedly unused c-strings to minimize the
> risk of de-allocating a still used c-string.
>
> c) C-string allocating code in #4 can delay allocation until it might be
> needed, similar to how (a) decides that the 0-terminator is in danger.
>
>
> Going forward, do you think we should:
>
> i. Keep using #2a and rewrite the known double-c_str() code.
> ii. Make double-c_str() code safe (i.e., switch to #2b).
> iii. Add code to make the existing #2 implementation safer.
> iiii. Switch to a completely different implementation like #1 or #4.

v. push SBuf as suggested by Amos and document this specific corner
case so to avoid it

-- 
    Francesco


More information about the squid-dev mailing list