[squid-dev] Squid: Small packets and low performance between squid and icap

Alex Rousskov rousskov at measurement-factory.com
Wed Feb 17 22:39:11 UTC 2016


On 02/16/2016 05:55 PM, Prashanth Prabhu wrote:
>> * A call to c_str() should not increase SBuf::len_  either because it
>> does not add a new character to the SBuf object. That call just
>> terminates the underlying buffer.
> 
> Well, without an increment of the MemBlob::size_ (or with an increment
> to the SBuf::len_) this would have been OK. However, once that 'size_'
> is incremented, we have the SBuf concept of how much buffer is used
> being "out of sync" with the MemBlob's conception of buffer usage.

AFAICT, your definition of "sync" does not match the code's definition
of "sync". You do not like something that the code does on purpose.


> FWIW I don't quite understand how the NUL-char doesn't add a new
> character to the SBuf object. Yes, it is terminating the string, but
> in 'C' it is also a legit character.

By definition, c_str() does not add any new characters to the SBuf
object. SBuf::c_str() is defined to leave SBuf contents (i.e., the
characters inside SBuf that you can access using SBuf public interfaces)
unchanged. You may not like that, but that is how this method is defined
to work. You may propose to change it (after making sure all callers are
fine with your change), or you may propose to add a new method.

FWIW, SBuf::c_str() models std::string::c_str() behavior.


> So, unclear what we are
> attempting with this magic; or why. Seems to me, not incrementing
> 'len_' is a mistake.

In many contexts, incrementing len_ will add an [extra] NUL character to
SBuf, and that NUL character will then be passed to other callers,
logged, written to HTTP agents, etc. Also, calling SBuf::c_str() twice
for the same SBuf object will add two NUL characters...

It would be possible to add an SBuf::terminate() method (or similar)
that does add a NUL character to SBuf, but that would be a different
method, used for different purposes.


>> Single-owner optimizations aside (a known TODO), the above is the
>> desired behavior according to the documented c_str() guarantees:
> 
> Can you please explain or point me to a document that has more info
> about this "Single-owner" optimization?

Please see MemBlob.h:

> bool canAppend(const size_type off, const size_type n) const {
>     // TODO: ignore offset (and adjust size) when the blob is not shared?
>     return isAppendOffset(off) && willFit(n);
> }

The TODO instructions are not very precise so please be careful to
preserve canAppend() functionality if you decide to optimize this.


>>>      * The returned value points to an internal location whose contents
>>>      * are guaranteed to remain unchanged only until the next call
>>>      * to a non-constant member function of the SBuf object.
>>
>> In other words, we cannot allow some _other_ SBuf object to overwrite
>> our null-termination character in the MemBlob we share with that other SBuf.
>>
>> The high price for that strong guarantee is one of the reasons we should
>> avoid c_str() calls in Squid code.
> 
> Note that the issue that I have explained is from a mostly stock squid
> 3.5.1 codebase. This isn't stemming from new c_str() calls added to
> the codebase.

Yes, IIRC, the issue you have described stems from a single c_str()
call, but there are many more (and growing) number of such calls, often
with poorly understood performance side effects.


> I have posted the current changes that are currently working for the
> most part below. It uses current capacity size as a heuristic to bump
> size up. The diff also has some previous fixes, that were pointed out
> to me on the thread, embedded in it.


I will only comment on the change that seems to be directly related to
our recent discussion:

> +    } else if (buf.spaceSize() < buf.currentCapacity() / 2) {
> +        buf.reserveCapacity(CLIENT_REQ_BUF_SZ * 4);

>From performance point of view, this change is problematic long-term
because reserveCapacity() guarantees single-ownership of the underlying
MemBlob and that guarantee is both expensive and not needed here. We
need a space to read, not single ownership. This is why rawSpace() was
added, BTW, but it is a little awkward to use in this context.

I am not sure what the best solution is (I do not have enough time to
study all the interactions and side effects), but we may need to either

A) add a new SBuf method (that does not provide such a guarantee) or
B) change in reserveCapacity() to remove the guarantee.

A study of current reserveCapacity() uses should determine the best of
the two options (if those two options are the best ones available).
After the new method is added or reserveCapacity() is changed, we should
call the new method or reserveCapacity() unconditionally, eliminating
most of your other patch changes.


HTH,

Alex.



More information about the squid-dev mailing list