[squid-dev] [PATCH] implement RFC3986

Kinkie gkinkie at gmail.com
Wed Mar 16 17:33:03 UTC 2016


>> Will it be initialized at all? I'd expect that fromHexTable, which is
>> const and POD be simply laid out in the data segment and not require
>> initialization at all.
>
> Are you implying that
>
> (a) fromHexTable is a C++11 constexpr _and_
> (b) constexpr cannot suffer from initialization order problems?
>
> If yes, then you should declare fromHexTable as constexpr so that
> somebody does not accidentally change it to be something else. This will
> make (a) a strong explicit statement rather than an implied and fragile
> implication.

both are list of constants. No calls to any code of any kind.
toHexTable can offer a stronger guarantee than const char *: it's a
const char * const.
It can't be declared constexpr: g++ complains that the standard
doesn't allow that for strings.
fromHexTable is constexpr.

> However, I cannot find any C++ rule that guarantees the behavior you
> expect -- your fromHexTable (even if you add constexpr to it) does not
> seem to match any of the three items at:
>
>   http://en.cppreference.com/w/cpp/language/constant_initialization

To me they seem both cases of (quote)
3) Static or thread-local object (not necessarily of class type), that
is not initialized by a constructor call, if the object is
value-initialized or if every expression in its initializer is a
constant expression.

> Can you point me to some documentation that guarantees your expectation
> will be fulfilled? To avoid misunderstanding: I am not saying your
> expectation is wrong (it certainly sounds reasonable to me). I am only
> saying that I cannot find any confirmation that what you expect is
> actually guaranteed.

This stems from my understading of the meaning of the ".data" section
of ELF files (and similar sections of other binaries), which may be
partial or incomplete.

> [Please avoid "no initialization" terminology because it implies that
> the object is left uninitialized -- what you probably mean is that
> fromHexTable is initialized during C++ "constant initialization" phase.]

Yes.

>> I agree however with simply moving the tables in the .cc file,
>> clearing all doubts.
>
> AFAIK, moving those table definitions to .cc file does not magically
> help FromHex() callers in any way. Moreover, they were already in the
> .cc file in the previous patch.

I suspect you are right. I was under the impression that if code in a
translation unit is called, then that translation unit has been
initialized by then. I now understand that I am wrong in assuming
that.

> Moving FromHex() itself to .cc file does not magically help indirect
> FromHex() callers in any way either.
>
>
> Please forgive me for whining, but it feels like you are trying various
> random combinations and using me as a validation test: Does this work?
> No, then how about this? Or perhaps that? This is a bad approach not
> just because it wastes hours but because I am not a good validator and
> will miss bugs. The correct approach would be to write code that you do
> not just "expect" to work correctly (that's always implied) but can
> _prove_ (to yourself, but using C++ rules) to work correctly as far as
> initialization order is concerned.

Unfortunately initialization rules are quite hard for me to understand yet.

>> +const int16_t fromHexTable[256] = {
>
> AFAICT, this needs to be "static" and should be "constexpr" to (a)
> guarantee constant initialization and (b) minimize the chances of
> somebody changing it to something that will not be initialized at
> "constant initialization" time. Please correct me if I am wrong.

Aren't variables not declared extern marked static by default?
Sure can do but it should be redundant.

> Please check other tables/globals as well.
>
>
>
>>> One full/stand-alone declaration per line please.
>
>> Ok
>
> These are still merged:

Oh, sorry. I restricted myself to declarations, not definitions.
But you are right, I had missed Rfc3986 declarations as well.
Done.

>> +const CharacterSet
>> +Rfc1738::Unsafe("rfc1738:unsafe", "<>\"# %{}|\\^~[]`'"),
>> +Rfc1738::Ctrls("rfc1738:ctrls", {{0x00, 0x1f}, {0x7f,0xff}}),
>> +Rfc1738::Reserved("rfc1738:reserved", ";/?:@=&"),
>> +Rfc1738::UnsafeAndCtrls = Rfc1738::Unsafe + Rfc1738::Ctrls,
>> +         Rfc1738::Unescaped = (Rfc1738::UnsafeAndCtrls - CharacterSet(nullptr,"%") ).rename("rfc1738:unescaped")
>
>
> Please check other declarations as well.
>
>
>>>> +    // XXX: SBuf lacking reserve(N)
>>>> +    // rv.reserve(s.length()*2); //TODO: optimize arbitrary constant
>>>
>>> AFAICT, SBuf::reserveSpace() should work fine here and in all other
>>> define-SBuf-and-immediately-reserve-space contexts. Am I missing something?
>>
>> API compatiblity; std::string doesn't have reserveSpace but only reserve.
>
> That answer does not compute for me: Why would API compatibility with
> std::string matter when you are not using templates anymore?

This is changed in the current version of the patch in fact.

> Just noticed that this was a private message. Please do not ask for free
> private code reviews unless it is really needed. I still hope that
> others will learn from these emails and not repeat the same problems in
> the future...

It was my mistake in answering; of course I agree that these
discussion should be in public.

-- 
    Francesco
-------------- next part --------------
A non-text attachment was scrubbed...
Name: rfc3986-v3.patch
Type: text/x-diff
Size: 17250 bytes
Desc: not available
URL: <http://lists.squid-cache.org/pipermail/squid-dev/attachments/20160316/fdc6f48d/attachment-0001.patch>


More information about the squid-dev mailing list