[squid-dev] [PATCH] SBuf API improvements

Sat Jan 2 22:33:56 UTC 2016

On Sat, Jan 2, 2016 at 7:01 PM, Alex Rousskov
<rousskov at measurement-factory.com> wrote:
> On 01/01/2016 07:28 PM, Amos Jeffries wrote:
>> On 2015-12-31 05:29, Alex Rousskov wrote:
>>> Yes, you can remove protections, but that usually results in a different
>>> kind of surprises -- the ones you want to avoid the most. I am pretty
>>> sure that given a statistically significant sample, the amount of buggy
>>> code calling reserve(length() + n) would be significantly larger than
>>> the amount of buggy code code calling reserveSpace(n).
>>
>> Maybe I'm missing something over the last few days discussion, but why
>> does reserve(length() + n) imply bugs?
>
> I did not say imply, but this is just a basic design principle: When a
> developer is forced to compute something, the probability that the
> developer computes (or the reviewer reads) something wrong goes up. In
> this particular case, it is easy to forget to compute the sum or get the
> length() from the wrong buffer for example.

I am not that pessimistic; after all the proposal follows the
principle of minimal surprise, and adheres to std:: API.
I can't see any specifal benefit of having a single or a double method
for this purpose, there's pros and cons to either choice.
Given that, I'd stick to the simpler reserve() call for now, and see
if the need arises in the future.

>> Lets not. Except to confirm that the bundled C/C++ helpers are not and
>> will not depend on mempools or any other src/ code.
>
> I agree that bundled helpers should not depend on mempools (but probably
> not for the reasons you think they should not).
>
> As for the separation of ./lib and ./src/, it should be driven by things
> other than the names of those directories. If we want to write powerful
> C++ helpers, the current separation is wrong because it forces Kinkie to
> generalize SBuf-driven algorithms to work with std::strings.

I'm not convinced there'll be many more instances of this artifact, if
there will be any at all. For most needs, std:: containers and
algorithms will support all needs; urlencoding is something we need
due to the way we have designed the helper APIs and is about all I can
think of which is not in std:: that a helper will need.

And honestly, working this way was kinda fun ;) I hope that the
results will be good enough, and I believe it won't be the norm.

>> Nod. If anyone is looking at using heavyweight components of Squid (like
>> mempools, SBuf etc) in the helpers then it is probably a situation where
>> the helper is not a helper but a new mode of operation for Squid workers.
>
> I do not consider SBuf component "heavy" but perhaps our definition of
> "heavy" differs.

I do consider it somewhat heavy due to the fact that it needs services
from a few other subsystems.

>>>>> Since Squid is (or should be) using standard encodings, I suspect there
>>>>> are libraries and modules (for various programming languages) that
>>>>> already provide appropriate encoding support. We would not have to
>>>>> maintain any of those implementations.
>>>>
>>>> There are. However:
>>>
>>> Please note that I was only talking about "libraries and modules" to use
>>> with _helpers_, not with Squid proper.
>>>
>>>
>>>> - the ones I could find seem to be quite generic, we have different
>>>> needs in different contexts
>>>
>>> This item is too general to agree or disagree with.

Googling for "url decoding library c++" returns several results. The
vast majority seem to be variations on the code we use in rfc1738,
with a C interface, some rely on regexes, or sscanf.

>>>> - impedence mismatch. These are either C or use std::string; this
>>>> generally means two more data-copies if we wish to use SBuf (or
>>>> std::string with a custom allocator)
>>>
>>> I lost you here. If helpers wish to use SBuf, then they do not need 3rd
>>> party libraries and Squid does not need to make its SBuf algorithms
>>> generic. If helpers do not wish to use SBuf, then 3rd party libraries do
>>> not introduce any extra copies (from SBufs).
>>>
>>
>> Nod. IMO, the need for a third-party RFC1738/3986 coder library is
>> primarily for the helpers, not Squid itself. We already have this
>> adequate coder logic for Squid internals. Moving the helpers to a
>> third-party library we can then adapt our custom logic for better
>> suiting SBuf etc.
>>
>> This both helps with simplifying the helpers dependency on Squid code
>> and eases third-party helpers being written as they no longer have to
>> cut-n-paste our rfc1738 code.
>>
>> So, if you found a FOSS library that provides RFC1738/3986 encoding and
>> supports std:string please look at adopting it for the helpers/ code.
>
>
> I disagree that using two encoding libraries (one for Squid proper and
> one for bundled helpers) is the best solution. If possible, I would
> recommend using one encoding library for both Squid proper and helpers.
> That direction implies an SBuf-driven library and SBuf use in C++ helpers.

.. or decoupling from that as I have done thus far.

>>>>> However, let's assume that our ultimate goal[1] is to ship fast,
>>>>> production-quality helpers that are built without external dependencies
>>>>> (i.e., built using Squid libraries) and that Squid code is (or will be)
>>>>> using SBuf for most string and buffering operations.
>>>>>
>>>>> Quality helpers are likely to need a variety of string and buffering
>>>>> operations. Which of the following approaches is better long-term?
>>>>>
>>>>> A. Support a growing number of templated algorithms while making SBuf
>>>>> API more and more interchangeable with std::string (while at the same
>>>>> time providing a growing number of SBuf methods for raw buffer
>>>>> operations).
>>>>>
>>>>> B. Make SBufs and SBuf-driven algorithms easily available in helpers by
>>>>> making SBuf storage/memory backing (and any other current dependencies
>>>>> on Squid core) easily replaceable with something that does not need any
>>>>> integration with Squid core.
>>>>>
>>>>> To me, option B looks like the right overall direction.
>
>
>> Both A and B are bad.
>>
>> I prefer C; ensuring helper code is a simple as possible with as few
>> dependencies - *particularly* from the complex code within src/.
>
> Option C implies re-implementing encoding algorithms just to avoid
> dependencies. This may be the worst option of all three. No matter what
> you do, the helpers will depend on common encoding algorithms (and other
> things). Either you make the helpers depend on Squid code or on some
> external libraries. This logic applies to both C++ and non-C++ helpers
> (except non-C++ helpers should not depend on Squid code, of course).

I believe that our main disagreement is on the extent this will be
needed; I believe it won't be needed much (if anymore at all).

>> IMHO, our primary goal with these helpers is to provide good quality
>> demo code for others to use as template for writing their own. With
>> secondary goal of acting as a source of some ready-made helpers for
>> common installation needs.
>
> All three options (A, B, and C) can more-or-less satisfy the above vague
> criteria, although I doubt that "good quality demo helper code" should
> be written in C++.

c++11 is better than C here IMHO.

-- 
    Francesco