[squid-dev] [RFC] Support concurrent SBuf::c_str() calls

Alex Rousskov rousskov at measurement-factory.com
Mon Oct 3 17:03:11 UTC 2016


On 10/02/2016 11:51 PM, Amos Jeffries wrote:
> On 3/10/2016 1:03 p.m., Alex Rousskov wrote:
>> On 10/02/2016 03:25 PM, Kinkie wrote:
>>> On Fri, Sep 30, 2016 at 6:03 PM, Alex Rousskov
>>>> Overall, I know of three primary ways to implement c_str():
>>>>
>>>> 1. Always 0-terminate SBuf-used storage. Safest implementation possible,
>>>> but has serious negative performance overheads, especially for tokenizing.

>>> If this is the decision, might as well drop sbuf altogether in favor
>>> of std::string (or possibly std::string + custom mempools-friendly
>>> Allocator).

>> You are probably right, although std::string may have other funny things
>> going that we might want to avoid/control.

> Yes. [...]

I am dropping this part of the discussion until somebody proposes to do
#1. The discussion contains wrong statements IMO but I will ignore them
in hope that they remain irrelevant (because nobody will do #1).



>>>> 2. Terminate the used storage at the time of c_str().
>>>>
>>>>     2a. Increment the used storage size. Current implementation.
>>>>         Leads to problems discussed in this thread.
>>>>
>>>>     2b. Do not increment the used storage size.
>>>>         Leads to other problems discussed in this thread.
>>>>
>>>> 3. Allocate and store a new c-string as needed, copying from the
>>>> storage. Safest implementation possible but has serious negative
>>>> performance overheads. These overheads affect different use cases than
>>>> those of #1.

>>> If we are aware of the underlying issue, there's no reason not to save
>>> the temporary to a naked char* and use that, as long as the SBuf is
>>> not touched it'll work fine.

> I imagined a cached const char* pointer inside SBuf that c_str()
> produces if it is unset. The cow() operations freeing that cached pointer.

A pointer to what? A pointer to buf() is approach #2. A pointer to
specially allocated storage is approach #3.

>>>> Going forward, do you think we should:
>>>>
>>>> i. Keep using #2a and rewrite the known double-c_str() code.
> 
> That is two options. I think we should fix the double-c_str() code
> regardless of any other changes.

Implementations #1, #2b, and #3 make double-c_str() code perfectly safe.
We should not rewrite it.


> These are definitely performance bugs
> with the 2nd call triggeringd reallocate regardless of other bad side
> effects.

Implementations #1, #2b, and #3 do not result in poor double-c_str()
performance. They make double-c_str() performance very close to a single
c_str() performance.


>>>> ii. Make double-c_str() code safe (i.e., switch to #2b).
>>>> iii. Add code to make the existing #2 implementation safer.
>>>> iiii. Switch to a completely different implementation like #1 or #4.

> There are very few things we can do will resolve this in such a short
> timeframe.

Agreed.


> Any choice we make to alter the design will take a long time to
> check every single code path works okay.

Not really -- if the API stays the same or becomes even safer, then the
impossible task of checking every single code path becomes unnecessary.


> Another option is:
> 
> vi. drop c_str() completely and use SBufToCstring() or SBufToString() as
> ways to obtain char*.

How will the proposed SBufToString() differ from the existing "c_str"?
If you are proposing that SBufToString() will allocate memory that the
calling code will have to clean up explicitly, then I am against that
idea because it makes the API worse, especially in the presence of
exceptions.


> Adjusting callers as necessary for those not to leak memory or be too
> bad for performance will be easier than a full code audit.

I do not think any of the other options require a "full code audit"
either so this is not a valid comparison point.

Alex.



More information about the squid-dev mailing list