[squid-dev] [PATCH] Bug 4438 - string pool refactoring

Amos Jeffries squid3 at treenet.co.nz
Thu Mar 31 10:13:59 UTC 2016


On 31/03/2016 7:29 a.m., Alex Rousskov wrote:
> On 03/30/2016 11:15 AM, Amos Jeffries wrote:
>> On 30/03/2016 8:19 a.m., Alex Rousskov wrote:
>>> This committed code does nothing useful: Static variables, such as
>>> "pools" are zero-initialized by default and that happens at constant
>>> initialization time.
> 
> 
>> Not necessarily. That is OS and allocator library dependent behaviour.
> 
> It is guaranteed by the C++ standard. See the "Static initialization"
> section at http://en.cppreference.com/w/cpp/language/initialization
> 

Okay. But forgive me for not trusting to that. I've left the pointers
being zeroed explicitly for now.

> 
>> For exmple, my compiler disagrees with you. use-before-initialization on
>> the return statement for any pool pointer not initialized by Mem::Init()
>> when the memset() is removed.
> 
> AFAICT, either you are compiling some code that is different from the
> one I was talking about or your compiler is buggy: Static PODs *are*
> zero-initialized by default.
> 
> 
>>> The "initialized" flag is useful for the changes in your patch, but the
>>> the comment in the patch would become misleading after the memset() call
>>> is removed because the flag essentially applies to the Mem::Init() call
>>> and not the "pools" variable. In other words, the final code should look
>>> something like this:
>>>
>>>
>>> static MemAllocator &
>>
>> In order to use a reference here MemAllocator will have to forgo
>> constructor parameters, and move its initialization into a virtual
>> method.
> 
> Only if you leave the related bugs intact, which obviously does not make
> sense. GetPool() should be returning a reference to an initialized pool
> object. How that pool was initialized is a separate issue. I do not
> think that individual pool initialization API needs any [drastic]
> changes. What we do need to change is when that initialization happens
> and, possibly, how some of the initialized pools are stored.
> 

All the code currently using GetPool assumes its a pointer. Most of that
code is untouched by me in any of this. While I agree that using a
reference is better, doing the conversion right now is too big a task. I
have been trying various ways to do it between these mails.


> 
>> By returning a reference to pointer here (possibly a nullptr) we retain
>> all the existing behaviour and code throughout Squid creating pools.
> 
> Yes, including the existing or future bugs where that nullptr is
> dereferenced.

Yes. There is a reason GetPool(t) is defined locally inside a file
called "old_api". It and all the functions using its pointers are
deprecated. New code and updated objects should all be made into
MEMPROXY objects.

All those old functions and callers currently assume they are working
with pointers. Some of the things I have been trying to point out to you
are where they rely on that to do things that are not possible with a
reference to pre-initialized object. Such as creating a variable-type
object whose pointer gets stored in the GetPool(t) array _by the caller_.

The amount of change and testing needed to make this a reference like
you want is just too much for an already deprecated allocator design.
Far easier/better to convert to the new API - and none of this helps
with the actual destructor sequence problem in trunk. That was solved a
while back with GetStrPool(sz) fixes (which does use a reference now).


> 
>>>>> AFAICT, this combination will assert if memAllocate() (or any similar
>>>>> external caller) is called "too early".
> 
>>>> That is intentional.
> 
>>> That intention is wrong IMO.
>>>
>>> Why intentionally delay pool initialization?
> 
> 
>> Because only the caller code (same code using memAllocate BTW) knows
>> what the pool parameters need to be initialized to. Including what
>> object type the allocator pointed to will be.
> 
> Either that caller code should be inside Initialize() or it should not
> use the "pools" array and its GetPool() API. Otherwise, we have no
> guarantee that the pool initialization caller will get to that array
> before a pool user will, resulting in the same kind of bugs we are
> fixing now. "Do not return pointers to callers that expect a valid
> object" is a simple/basic design principle. We should not be fighting
> over it! Everything else in this discussion flows from that principle.

The old code using this API is simply that. Old. This is just one of the
ways its "broken" - despite that it all seems to be hanging together
pretty well these past few decades.

I would like to re-code it straight to using MEMPROXY APIs. Not to
recode it to use a "perfect" GetPool() design, only to throw that work
away in a few weeks/months/immediately.


> 
>>>> We can't return a reference here since the initializer functions make
>>>> use of GetPool(t) to locate the array pointer to be set.
>>>
>>> In the fixed code I am talking about, the initialization happens before
>>> GetPool() returns. Thus, GetPool() always returns a ready-to-use pool
>>> and, hence, can return a reference to it.
> 
>> That means every call to GetPools() will have to be instrumented with
>> the text label and size of the object wanted back. Just in case this
>> call was the first ever use of the pool wanted.
> ...
> 
> Not at all. The "pools" array is initialized before first use by
> Initialize() code similar to the current Mem::Init(). Any pool inside
> the old "pools" array that cannot be initialized from Initialize() is
> removed from "pools" (because it is not safe to keep it there) and is
> moved where it can be safely initialized before (or upon) the first use.

What you are describing here sounds almost exactly like the tasks done
when re-factoring classes to use MEMPROXY 'new API'.

I believe we have all been in agreement on moving to that API for some
time and quite a few objects already have done the move in recent years.
Finishing off that conversion is out of scope for this particular change
though. Its looking closer than ever now so I hope to have time to
finish it soon-ish either way.


> 
>> The allocators are statically initialized pointer
>> in per-class new() members, such that the initialization occurs only on
>> first-use. Just not by GetPool() itself - which I think is fine.
> 
> Returning nil from GetPool() is not fine because GetPool() does not know
> whether the caller is going to initialize that pool pointer or
> dereference it. I do not think it is very difficult to adjust the code
> so that we do not have to live with that risk, but you have the right to
> refuse to do so. I am thankful that we stop now and not five iterations
> earlier when the code was in a much worse shape. Somebody can always
> come back to this later, when Squid starts crashing again.


Okay. With the understanding that a) what we have is only a stable
half-way point, and b) the old code is working well enough with pointers
that Squid should not crash for a while.  I am going to merge what has
been done provided it passes some testing and get on with the release
this weekend. Returning to the memory issues afterwards.

Amos



More information about the squid-dev mailing list